(Rev. 10-96)

TRANSMITTAL LETTER TO THE UNITED STATES 
DESIGNATED/ELECTED OFFICE (DO/EO/US) 
CONCERNING A FILING UNDER 35 U.S.C. 371



ATTORNEY'S DOCKET NUMBER 

030708-035 



U.S. APPLICATION NO.



09/403,724



INTERNATIONAL APPLICATION NO. 

PCT/IB98/00625 



INTERNATIONAL FILING DATE

24 April 1998 



PRIORITY DATE CLAIMED 

26 April 1997 



TITLE OF INVENTION 
NEUROTRYPSIN



APPLICANT(S) FOR DO/EO/US 

Peter Sonderegger



Applicant herewith submits to the United States Designated/Elected Office (DO/EO/US) the following items and other information: 

1. Q This is a FIRST submission of items concerning a filing under 35 U.S.C. 371.

2. □ This is a SECOND or SUBSEQUENT submission of items concerning a filing under 35 U.S.C. 371.

3. H This is an express request to begin national examination procedures (35 U.S.C. 371(f)) at any time rather than delay examination until the expiration of the applicable time limit set in 35 U.S.C. 371(b) and the PCT Articles 22 and 39(1).

4. CH A proper Demand for International Preliminary Examination was made by the 19th month from the earliest claimed priority date.

5. Q A copy of the International Application as filed (35 U.S.C. 371(c)(2))

a. □ is transmitted herewith (required only if not transmitted by the International Bureau).

b. □ has been transmitted by the International Bureau.

c. n is not required, as the application was filed in the United States Receiving Office (RO/US).

6. A translation of the International Application into English (35 U.S.C. 371(c)(2)).

7. □ Amendments to the claims of the International Application under PCT Article 19 (35 U.S.C. 371(c)(3))

a. CH are transmitted herewith (required only if not transmitted by the International Bureau).

b. □ have been transmitted by the International Bureau.

c. EH have not been made; however, the time limit for making such amendments has NOT expired.

d. □ have not been made and will not be made.

8. A translation of the amendments to the claims under PCT Article 19 (35 U.S.C. 371(c)(3)).

9. □ An oath or declaration of the inventor(s) (35 U.S.C. 371(c)(4)).

10. dl A translation of the annexes to the International Preliminary Examination Report under PCT Article 36 (35 U.S.C. 371(c)(5)).

Items 11. to 16. below concern other document(s) or information included:

11. □ An Information Disclosure Statement under 37 CFR 1.97 and 1.98.

12. □ An assignment document for recording. A separate cover sheet in compliance with 37 CFR 3.28 and 3.31 is included.

13. □ A FIRST preliminary amendment.
       A SECOND or SUBSEQUENT preliminary amendment.

14. □ A substitute specification.

15. □ A change of power of attorney and/or address letter.

16. □ Other items or information:
ATTORNEY'S DOCKET NUMBER



17. Q The following fees are submitted: 


CALCULATIONS 




Basic National Fee (37 CFR 1.492(a)(1)-(5)):

Search Report has been prepared by the EPO or JPO $840.00 (970)

International preliminary examination fee paid to USPTO (37 CFR 1.482)

No international preliminary examination fee paid to USPTO (37 CFR 1.482) but international search fee paid to USPTO (37 CFR 1.445(a)(2)) $760.00 (958)

Neither international preliminary examination fee (37 CFR 1.482) nor international search fee (37 CFR 1.445(a)(2)) paid to USPTO $970.00 (960)

International preliminary examination fee paid to USPTO (37 CFR 1.482)

ENTER APPROPRIATE BASIC FEE AMOUNT = $ 970.00




Surcharge of $130.00 (154) for furnishing the oath or declaration later than 20 □ 30 months from the earliest claimed priority date (37 CFR 1.492(e)).   $ 0.00




Claims                Number Filed    Number Extra    Rate

Total Claims          15 - 20 =       0               x $18.00 (966)     $ 0.00

Independent Claims    14 - 3 =        11              x $78.00 (964)     $ 858.00

Multiple dependent claim(s) (if applicable)           + $260.00 (968)    $ 0.00




TOTAL OF ABOVE CALCULATIONS = 


$ 1,828.00 




Reduction by 1/2 for filing by small entity, if applicable. Verified Small Entity statement must also be filed. (Note 37 CFR 1.9, 1.27, 1.28).


$ 




SUBTOTAL = 


$ 




Processing fee of $130.00 (156) for furnishing the English translation later than 20 □ 30 months from the earliest claimed priority date (37 CFR 1.492(f)). +


$ 




TOTAL NATIONAL FEE = 


$ 1,828.00 




Fee for recording the enclosed assignment (37 CFR 1.21(h)). The assignment must be accompanied by an appropriate cover sheet (37 CFR 3.28, 3.31). $40.00 per property +


$ 




TOTAL FEES ENCLOSED = 


$ 1,828.00 
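
The arithmetic on this fee sheet can be reproduced with a short sketch. The Python below is illustrative only: the function name `excess_claim_fee` and the variable names are assumptions introduced here, while the amounts (a $970.00 basic fee, rates of $18.00 and $78.00, and 20 total and 3 independent claims covered by the basic fee) are the figures shown on the sheet.

```python
from decimal import Decimal

# Figures copied from the fee sheet; names are hypothetical.
BASIC_FEE = Decimal("970.00")        # neither IPE fee nor search fee paid to USPTO
RATE_TOTAL_CLAIM = Decimal("18.00")  # per total claim in excess of 20
RATE_INDEP_CLAIM = Decimal("78.00")  # per independent claim in excess of 3


def excess_claim_fee(filed: int, included: int, rate: Decimal) -> Decimal:
    """Fee for claims beyond the number covered by the basic fee."""
    extra = max(filed - included, 0)  # never negative: 15 - 20 counts as 0 extra
    return extra * rate


total_claims_fee = excess_claim_fee(15, 20, RATE_TOTAL_CLAIM)
indep_claims_fee = excess_claim_fee(14, 3, RATE_INDEP_CLAIM)
total = BASIC_FEE + total_claims_fee + indep_claims_fee

print(total_claims_fee)  # 0.00
print(indep_claims_fee)  # 858.00
print(total)             # 1828.00
```

No small-entity reduction, surcharge, or processing fee is entered on the sheet, so the sketch omits them.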






Amount to be: 
refunded 


$ 


charged 


$ 
a. Q A check in the amount of $1,828.00 to cover the above fees is enclosed.

b. im Please charge my Deposit Account No. 02-4800 in the amount of $ to cover the above fees. A duplicate copy of this sheet is enclosed.

c. Q The Commissioner is hereby authorized to charge any additional fees which may be required, or credit any overpayment to Deposit Account No. 02-4800. A duplicate copy of this sheet is enclosed.



NOTE: Where an appropriate time limit under 37 CFR 1.494 or 1.495 has not been met, a petition to revive (37 CFR 1.137(a) or (b)) must be filed and granted to restore the application to pending status.

SEND ALL CORRESPONDENCE TO:



William L. Mathis 

BURNS, DOANE, SWECKER & MATHIS, L.L.P. 
P.O. Box 1404 

Alexandria, Virginia 22313-1404 



Bruce J. Boggs, Jr.



NAME 

32,344 

REGISTRATION NUMBER 
Patent

Attorney's Docket No. 030708-035 

IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re Patent Application of 
Peter SONDEREGGER 
Serial No.: 09/403,724
Filed: October 26, 1999 
For: NEUROTRYPSIN 

TRANSMITTAL LETTER FOR MISSING PARTS OF APPLICATION 

Assistant Commissioner for Patents 
Washington, D.C. 20231 

Sir: 

In complete response to the Notice to Comply with Requirements for Patent 
Applications Containing Nucleotide Sequence and/or Amino Acid Sequence disclosures 
dated not yet received, enclosed please find:

[X] A copy of the "Sequence Listing" in computer readable form in
compliance with 37 C.F.R. §§ 1.823(b) and 1.824.

[X] A statement that the content of the paper and computer readable copies
are the same as set forth in 37 C.F.R. § 1.821(f).

The Commissioner is hereby authorized to charge any additional fees under 
37 C.F.R. §§1.16, 1.17, and 1.21 that may be required by this paper, and to credit any 
overpayment to Deposit Account No. 02-4800. A duplicate copy of this paper is enclosed. 

Respectfully submitted, 

BURNS, DOANE, SWECKER & MATHIS, L.L.P. 


Richard C. Ekstrom 
Registration No. 37,027 




Group Art Unit: Unknown 
Examiner: Unknown 
ATTENTION: BOX SEQUENCE 



1737 King Street, Suite 500
Alexandria, VA 22314-2756 
(703) 836-6620 

Date: [illegible]



Patent 

Attorney's Docket No. 030708-035 

Applicant or Patentee: Peter Sonderegger

Application or Patent No.: 

Filed or Issued: October 26, 1999 

For: NEUROTRYPSIN 

VERIFIED STATEMENT (DECLARATION) CLAIMING SMALL ENTITY 
STATUS (37 C.F.R. §§ 1.9(f) AND 1.27(b)) - INDEPENDENT INVENTOR

As a below-named inventor, I hereby declare that I qualify as an independent inventor as defined 
in 37 C.F.R. § 1.9(c) for purposes of paying reduced fees under Sections 41(a) and 41(b) of Title 
35, United States Code, to the Patent and Trademark Office with regard to the invention entitled 
Neurotrypsin described in: 

[ ] the specification filed herewith 

[X] Application No. , filed October 26, 1999 . 

[ ] Patent No. , issued . 



I have not assigned, granted, conveyed, or licensed and am under no obligation under contract or 
law to assign, grant, convey, or license any rights in the invention either to any person who could 
not be classified as an independent inventor under 37 C.F.R. § 1.9(c) if that person had made the
invention, or to any concern that would not qualify as either a small business concern under 
37 C.F.R. § 1.9(d) or a nonprofit organization under 37 C.F.R. § 1.9(e). 

Each person, concern or organization to which I have assigned, granted, conveyed, or licensed or 
am under an obligation under contract or law to assign, grant, convey, or license any rights in the
invention is listed below: 

[X] no such person, concern, or organization 

[ ] persons, concerns, or organizations listed below* 

*NOTE: Separate verified statements are required from each named person, 
concern, or organization having rights to the invention averring to their status as 
small entities. (37 C.F.R. § 1.27.) 





FULL NAME 


[ ] individual 


[ ] small business concern 


[ ] nonprofit organization 




FULL NAME 


[ ] individual 


[ ] small business concern 


[ ] nonprofit organization 


ADDRESS




[ ] individual 


[ ] small business concern 


[ ] nonprofit organization



I acknowledge the duty to file, in this application or patent, notification of any change in status 
resulting in loss of entitlement to small entity status prior to paying, or at the time of paying, the 
earlier of the issue fee or any maintenance fee due after the date on which status as a small entity 
is no longer appropriate. (37 C.F.R. § 1.28(b).) 
Application No.

Attorney's Docket No. 030708-035 



I hereby declare that all statements made herein of my own knowledge are true and that all 
statements made on information and belief are believed to be true; and further that these 
statements were made with the knowledge that willful false statements and the like so made are 
punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States 
Code; and that such willful false statements may jeopardize the validity of the application, any 
patent issuing thereon, or any patent to which this verified statement is directed. 



Name Peter Sonderegger

Signature [signature] Date [illegible]
Patent

Attorney's Docket No. 030708-035 
IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 



In re Patent Application of 
Peter SONDEREGGER 
Serial No.: 09/403,724 
Filed: October 26, 1999 
For: NEUROTRYPSIN 




Group Art Unit: Unassigned 
Examiner: Unassigned 
ATTENTION: BOX SEQUENCE 



PRELIMINARY AMENDMENT 



Assistant Commissioner for Patents 
Washington, D.C. 20231 



Prior to examination on the merits, please amend the above-identified 
application as follows: 

IN THE SPECIFICATION : 

In compliance with 37 C.F.R. § 1.823(a), please delete pages 16-32 of the 
specification and insert therefor the attached paper copy of the "Sequence Listing" between 
page 15 of the Disclosure and the first page of the Claims to replace the Sequence Listing 
identified thereon. 



Serial No. 09/403,724 



REMARKS 

The paper copy of the Sequence Listing for the subject application is by this
amendment added between page 15 of the Specification and the first page of the Claims to
replace the Sequence Listing identified thereon. Please amend the page numbers
accordingly.

Favorable consideration on the merits is respectfully requested. 

Respectfully submitted, 

BURNS, DOANE, SWECKER & MATHIS, L.L.P. 

By [signature]
Richard C. Ekstrom 
Registration No. 37,027 

P.O. Box 1404 

Alexandria, Virginia 22313-1404 
(703) 836-6620 

Date: December 20, 1999 
SEQUENCE LISTING

<110> SONDEREGGER, Peter

<120> NEUROTRYPSIN

<130> 030708-035

<140> 09/403,724
<141> 1999-10-26

<150> PCT/IB98/00625
<151> 1998-04-24

<150> CH 0966/97
<151> 1997-04-26

<160> 28

<170> PatentIn Ver. 2.0

<210> 1
<211> 3350
<212> DNA
<213> Homo sapiens

<220>
<221> sig_peptide
<222> (44)..(103)

<220>
<221> mat_peptide
<222> (104)..(2668)

<220>
<221> CDS
<222> (44)..(2668)

<220>
<221> 5'UTR
<222> (1)..(43)

<220>
<221> 3'UTR
<222> (2669)..(3350)

<400> 1

cggaagctgg ggagcatgga ccagaccccg cagcgctggc acc atg acg ctc gcc 55
Met Thr Leu Ala
-20

cgc ttc gtg cta gcc ctg atg tta ggg gcg ctc ccc gaa gtg gtc ggc 103
Arg Phe Val Leu Ala Leu Met Leu Gly Ala Leu Pro Glu Val Val Gly
-15 -10 -5 -1






ttt gat tct gtc ctc aat gat tcc ctc cac cac agc cac cgc cat tcg
Phe Asp Ser Val Leu Asn Asp Ser Leu His His Ser His Arg His Ser
1 5 10 15

ccc cct gcg ggt ccg cac tac ccc tat tac ctt ccc acc cag cag cgg
Pro Pro Ala Gly Pro His Tyr Pro Tyr Tyr Leu Pro Thr Gln Gln Arg
20 25 30

ccc ccg acg acg cgt ccg ccg ccg cct ctc ccg cgc ttc ccg cgc ccc
Pro Pro Thr Thr Arg Pro Pro Pro Pro Leu Pro Arg Phe Pro Arg Pro
35 40 45

ccg cgg gcg ctc cct gcc cag cgc ccg cac gcc ctc cag gcc ggg cac
Pro Arg Ala Leu Pro Ala Gln Arg Pro His Ala Leu Gln Ala Gly His
50 55 60

acg ccc cgg ccg cac ccc tgg ggc tgc ccc gcc ggc gag cca tgg gtc
Thr Pro Arg Pro His Pro Trp Gly Cys Pro Ala Gly Glu Pro Trp Val
65 70 75 80

agc gtg acg gac ttc ggc gcc ccg tgt ctg cgg tgg gcg gag gtg cca
Ser Val Thr Asp Phe Gly Ala Pro Cys Leu Arg Trp Ala Glu Val Pro
85 90 95

ccc ttc ctg gag cgg tcg ccc cca gcg agc tgg gct cag ctg cga gga
Pro Phe Leu Glu Arg Ser Pro Pro Ala Ser Trp Ala Gln Leu Arg Gly
100 105 110

cag cgc cac aac ttt tgt cgg agc ccc gac ggc gcg ggc aga ccc tgg
Gln Arg His Asn Phe Cys Arg Ser Pro Asp Gly Ala Gly Arg Pro Trp
115 120 125

tgt ttc tac gga gac gcc cgt ggc aag gtg gac tgg ggc tac tgc gac
Cys Phe Tyr Gly Asp Ala Arg Gly Lys Val Asp Trp Gly Tyr Cys Asp
130 135 140

tgc aga cac gga tca gta cga ctt cgt ggc ggc aaa aat gag ttt gaa
Cys Arg His Gly Ser Val Arg Leu Arg Gly Gly Lys Asn Glu Phe Glu
145 150 155 160

ggc aca gtg gaa gta tat gca agt gga gtt tgg ggc act gtc tgt agc
Gly Thr Val Glu Val Tyr Ala Ser Gly Val Trp Gly Thr Val Cys Ser
165 170 175

agc cac tgg gat gat tct gat gca tca gtc att tgt cac cag ctg cag
Ser His Trp Asp Asp Ser Asp Ala Ser Val Ile Cys His Gln Leu Gln
180 185 190

ctg gga gga aaa gga ata gca aaa caa acc ccg ttt tct gga ctg ggc
Leu Gly Gly Lys Gly Ile Ala Lys Gln Thr Pro Phe Ser Gly Leu Gly
195 200 205



ctt att ccc att tat tgg agc aat gtc cgt tgc cga gga gat gaa gaa 775
Leu Ile Pro Ile Tyr Trp Ser Asn Val Arg Cys Arg Gly Asp Glu Glu
210 215 220

aat ata ctg ctt tgt gaa aaa gac atc tgg cag ggt ggg gtg tgt cct
Asn Ile Leu Leu Cys Glu Lys Asp Ile Trp Gln Gly Gly Val Cys Pro
225 230 235 240

cag aag atg gca gct gct gtc acg tgt agc ttt tcc cat ggc cca acg
Gln Lys Met Ala Ala Ala Val Thr Cys Ser Phe Ser His Gly Pro Thr
245 250 255

ttc ccc atc att cgc ctt gct gga ggc agc agt gtg cat gaa ggc cgg
Phe Pro Ile Ile Arg Leu Ala Gly Gly Ser Ser Val His Glu Gly Arg
260 265 270

gtg gag ctc tac cat gct ggc cag tgg gga acc gtt tgt gat gac caa
Val Glu Leu Tyr His Ala Gly Gln Trp Gly Thr Val Cys Asp Asp Gln
275 280 285

tgg gat gat gcc gat gca gaa gtg atc tgc agg cag ctg ggc ctc agt
Trp Asp Asp Ala Asp Ala Glu Val Ile Cys Arg Gln Leu Gly Leu Ser
290 295 300

ggc att gcc aaa gca tgg cat cag gca tat ttt ggg gaa ggg tct ggc
Gly Ile Ala Lys Ala Trp His Gln Ala Tyr Phe Gly Glu Gly Ser Gly
305 310 315 320

cca gtt atg ttg gat gaa gta cgc tgc act ggg aat gag ctt tca att
Pro Val Met Leu Asp Glu Val Arg Cys Thr Gly Asn Glu Leu Ser Ile
325 330 335

gag cag tgt cca aag agc tcc tgg gga gag cat aac tgt ggc cat aaa
Glu Gln Cys Pro Lys Ser Ser Trp Gly Glu His Asn Cys Gly His Lys
340 345 350

gaa gat gct gga gtg tcc tgt acc cct cta aca gat ggg gtc atc aga
Glu Asp Ala Gly Val Ser Cys Thr Pro Leu Thr Asp Gly Val Ile Arg
355 360 365

ctt gca ggt ggg aaa ggc agc cat gag ggt cgc ttg gag gta tat tac
Leu Ala Gly Gly Lys Gly Ser His Glu Gly Arg Leu Glu Val Tyr Tyr
370 375 380

aga ggc cag tgg gga act gtc tgt gat gat ggc tgg act gag ctg aat
Arg Gly Gln Trp Gly Thr Val Cys Asp Asp Gly Trp Thr Glu Leu Asn
385 390 395 400

aca tac gtg gtt tgt cga cag ttg gga ttt aaa tat ggt aaa caa gca
Thr Tyr Val Val Cys Arg Gln Leu Gly Phe Lys Tyr Gly Lys Gln Ala
405 410 415

tct gcc aac cat ttt gaa gaa agc aca ggg ccc ata tgg ttg gat gac
Ser Ala Asn His Phe Glu Glu Ser Thr Gly Pro Ile Trp Leu Asp Asp
420 425 430

gtc agc tgc tca gga aag gaa acc aga ttt ctt cag tgt tcc agg cga
Val Ser Cys Ser Gly Lys Glu Thr Arg Phe Leu Gln Cys Ser Arg Arg
435 440 445



cag tgg gga agg cat gac tgc agc cac cgc gaa gat gtt agc att gcc
Gln Trp Gly Arg His Asp Cys Ser His Arg Glu Asp Val Ser Ile Ala
450 455 460

tgc tac cct ggc ggc gag gga cac agg ctc tct ctg ggt ttt cct gtc
Cys Tyr Pro Gly Gly Glu Gly His Arg Leu Ser Leu Gly Phe Pro Val
465 470 475 480

aga ctg atg gat gga gaa aat aag aaa gaa gga cga gtg gag gtt ttt
Arg Leu Met Asp Gly Glu Asn Lys Lys Glu Gly Arg Val Glu Val Phe
485 490 495

atc aat ggc cag tgg gga aca atc tgt gat gat gga tgg act gat aag
Ile Asn Gly Gln Trp Gly Thr Ile Cys Asp Asp Gly Trp Thr Asp Lys
500 505 510

gat gca gct gtg atc tgt cgt cag ctt ggc tac aag ggt cct gcc aga
Asp Ala Ala Val Ile Cys Arg Gln Leu Gly Tyr Lys Gly Pro Ala Arg
515 520 525

gca aga acc atg gct tac ttt gga gaa gga aaa gga ccc atc cat gtg
Ala Arg Thr Met Ala Tyr Phe Gly Glu Gly Lys Gly Pro Ile His Val
530 535 540

gat aat gtg aag tgc aca gga aat gag agg tcc ttg gct gac tgt atc
Asp Asn Val Lys Cys Thr Gly Asn Glu Arg Ser Leu Ala Asp Cys Ile
545 550 555 560

aag caa gat att gga aga cac aac tgc cgc cac agt gaa gat gca gga
Lys Gln Asp Ile Gly Arg His Asn Cys Arg His Ser Glu Asp Ala Gly
565 570 575

gtt att tgt gat tat ttt ggc aag aag gcc tca ggt aac agt aat aaa
Val Ile Cys Asp Tyr Phe Gly Lys Lys Ala Ser Gly Asn Ser Asn Lys
580 585 590

gag tcc ctc tca tct gtt tgt ggc ttg aga tta ctg cac cgt cgg cag
Glu Ser Leu Ser Ser Val Cys Gly Leu Arg Leu Leu His Arg Arg Gln
595 600 605

aag cgg atc att ggt ggg aaa aat tct tta agg ggt ggt tgg cct tgg
Lys Arg Ile Ile Gly Gly Lys Asn Ser Leu Arg Gly Gly Trp Pro Trp
610 615 620

cag gtt tcc ctc cgg ctg aag tca tcc cat gga gat ggc agg ctc ctc
Gln Val Ser Leu Arg Leu Lys Ser Ser His Gly Asp Gly Arg Leu Leu
625 630 635 640

tgc ggg gct acg ctc ctg agt agc tgc tgg gtc ctc aca gca gca cac
Cys Gly Ala Thr Leu Leu Ser Ser Cys Trp Val Leu Thr Ala Ala His
645 650 655

tgt ttc aag agg tat ggc aac agc act agg agc tat gct gtt agg gtt
Cys Phe Lys Arg Tyr Gly Asn Ser Thr Arg Ser Tyr Ala Val Arg Val
660 665 670



gga gat tat cat act ctg gta cca gag gag ttt gag gaa gaa att gga
Gly Asp Tyr His Thr Leu Val Pro Glu Glu Phe Glu Glu Glu Ile Gly
675 680 685

gtt caa cag att gtg att cat cgg gag tat cga ccc gac cgc agt gat
Val Gln Gln Ile Val Ile His Arg Glu Tyr Arg Pro Asp Arg Ser Asp
690 695 700

tat gac ata gcc ctg gtt aga tta caa gga cca gaa gag caa tgt gcc
Tyr Asp Ile Ala Leu Val Arg Leu Gln Gly Pro Glu Glu Gln Cys Ala
705 710 715 720

aga ttc agc agc cat gtt ttg cca gcc tgt tta cca ctc tgg aga gag
Arg Phe Ser Ser His Val Leu Pro Ala Cys Leu Pro Leu Trp Arg Glu
725 730 735

agg cca cag aaa aca gca tcc aac tgt tac ata aca gga tgg ggt gac
Arg Pro Gln Lys Thr Ala Ser Asn Cys Tyr Ile Thr Gly Trp Gly Asp
740 745 750

aca gga cga gcc tat tca aga aca cta caa caa gca gcc att ccc tta
Thr Gly Arg Ala Tyr Ser Arg Thr Leu Gln Gln Ala Ala Ile Pro Leu
755 760 765

ctt cct aaa agg ttt tgt gaa gaa cgt tat aag ggt cgg ttt aca ggg
Leu Pro Lys Arg Phe Cys Glu Glu Arg Tyr Lys Gly Arg Phe Thr Gly
770 775 780

aga atg ctt tgt gct gga aac ctc cat gaa cac aaa cgc gtg gac agc
Arg Met Leu Cys Ala Gly Asn Leu His Glu His Lys Arg Val Asp Ser
785 790 795 800

tgc cag gga gac agc gga gga cca ctc atg tgt gaa cgg ccc gga gag
Cys Gln Gly Asp Ser Gly Gly Pro Leu Met Cys Glu Arg Pro Gly Glu
805 810 815

agc tgg gtg gtg tat ggg gtg acc tcc tgg ggg tat ggc tgt gga gtc
Ser Trp Val Val Tyr Gly Val Thr Ser Trp Gly Tyr Gly Cys Gly Val
820 825 830

aag gat tct cct ggt gtt tat acc aaa gtc tca gcc ttt gta cct tgg
Lys Asp Ser Pro Gly Val Tyr Thr Lys Val Ser Ala Phe Val Pro Trp
835 840 845



ata aaa agt gtc acc aaa ctg taattcttca tggaaacttc aaagcagcat 2698
Ile Lys Ser Val Thr Lys Leu
850 855

ttaaacaaat ggaaaacttt gaacccccac tattagcact cagcagagat gacaacaaat 2758
ggcaagatct gtttttgctt tgtgttgtgg taaaaaattg tgtaccccct gctgcttttg 2818
agaaatttgt gaacattttc agaggcctca gtgtagtgga agtgataatc cttaaatgaa 2878
cattttctac cctaatttca ctggagtgac ttattctaag cctcatctat cccctaccta 2938
tttctcaaaa tcattctatg ctgattttac aaaagatcat ttttacattt gaactgagaa 2998
ccccttttaa ttgaatcagt ggtgtctgaa atcatattaa atacccacat ttgacataaa 3058
tgcggtaccc tttactacac tcatgagtgg catatttatg cttaggtctt ttcaaaagac 3118
ttgacaagaa atcttcatat tctctgtagc ctttgtcaag tgaggaaatc agtggttaaa 3178
gaattccact ataaactttt aggcctgaat aggagtagta aagcctcaag gacatctgcc 3238
tgtcacaata tattctcaaa gtgatctgat atttggaaac aagtatcctt gttgagtacc 3298
aagtgctaca gaaaccataa gataaaaata ctttctacct acagcgtgcc cg 3350
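
The feature coordinates declared for SEQ ID NO: 1 can be cross-checked arithmetically against the declared lengths. The sketch below is illustrative only and is not part of the listing as filed: the helper `span_length` and the dictionary names are assumptions introduced here, and the coordinates are those given in the <222> fields above.

```python
import re

# Feature locations as declared in the <222> fields for SEQ ID NO: 1.
FEATURES = {
    "5'UTR": "(1)..(43)",
    "sig_peptide": "(44)..(103)",
    "mat_peptide": "(104)..(2668)",
    "CDS": "(44)..(2668)",
    "3'UTR": "(2669)..(3350)",
}


def span_length(location: str) -> int:
    """Length in nucleotides of a simple start..end location (hypothetical helper)."""
    start, end = (int(n) for n in re.findall(r"\d+", location))
    return end - start + 1


lengths = {name: span_length(loc) for name, loc in FEATURES.items()}

# Coding features must be a whole number of codons.
residues = {name: lengths[name] // 3 for name in ("sig_peptide", "mat_peptide", "CDS")}
assert all(lengths[name] % 3 == 0 for name in residues)

# Signal peptide plus mature peptide should account for the whole CDS,
# and the UTRs plus the CDS should account for the declared <211> length.
assert residues["sig_peptide"] + residues["mat_peptide"] == residues["CDS"]
assert lengths["5'UTR"] + lengths["CDS"] + lengths["3'UTR"] == 3350

print(residues)  # {'sig_peptide': 20, 'mat_peptide': 855, 'CDS': 875}
```

The 875 codons of the CDS agree with the <211> length of 875 declared for SEQ ID NO: 2, and the 855 codons of the mature peptide agree with the final residue number, 855, of the translation.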



<210> 2 
<211> 875 
<212> PRT 

<213> Homo sapiens 
<400> 2 

Met Thr Leu Ala Arg Phe Val Leu Ala Leu Met Leu Gly Ala Leu Pro
-20 -15 -10 -5

Glu Val Val Gly Phe Asp Ser Val Leu Asn Asp Ser Leu His His Ser
-1 1 5 10

His Arg His Ser Pro Pro Ala Gly Pro His Tyr Pro Tyr Tyr Leu Pro
15 20 25

Thr Gln Gln Arg Pro Pro Thr Thr Arg Pro Pro Pro Pro Leu Pro Arg
30 35 40

Phe Pro Arg Pro Pro Arg Ala Leu Pro Ala Gln Arg Pro His Ala Leu
45 50 55 60

Gln Ala Gly His Thr Pro Arg Pro His Pro Trp Gly Cys Pro Ala Gly
65 70 75

Glu Pro Trp Val Ser Val Thr Asp Phe Gly Ala Pro Cys Leu Arg Trp
80 85 90

Ala Glu Val Pro Pro Phe Leu Glu Arg Ser Pro Pro Ala Ser Trp Ala
95 100 105

Gln Leu Arg Gly Gln Arg His Asn Phe Cys Arg Ser Pro Asp Gly Ala
110 115 120

Gly Arg Pro Trp Cys Phe Tyr Gly Asp Ala Arg Gly Lys Val Asp Trp
125 130 135 140

Gly Tyr Cys Asp Cys Arg His Gly Ser Val Arg Leu Arg Gly Gly Lys
145 150 155

Asn Glu Phe Glu Gly Thr Val Glu Val Tyr Ala Ser Gly Val Trp Gly
160 165 170

Thr Val Cys Ser Ser His Trp Asp Asp Ser Asp Ala Ser Val Ile Cys
175 180 185

His Gln Leu Gln Leu Gly Gly Lys Gly Ile Ala Lys Gln Thr Pro Phe
190 195 200

Ser Gly Leu Gly Leu Ile Pro Ile Tyr Trp Ser Asn Val Arg Cys Arg
205 210 215 220

Gly Asp Glu Glu Asn Ile Leu Leu Cys Glu Lys Asp Ile Trp Gln Gly
225 230 235

Gly Val Cys Pro Gln Lys Met Ala Ala Ala Val Thr Cys Ser Phe Ser
240 245 250

His Gly Pro Thr Phe Pro Ile Ile Arg Leu Ala Gly Gly Ser Ser Val
255 260 265

His Glu Gly Arg Val Glu Leu Tyr His Ala Gly Gln Trp Gly Thr Val
270 275 280

Cys Asp Asp Gln Trp Asp Asp Ala Asp Ala Glu Val Ile Cys Arg Gln
285 290 295 300

Leu Gly Leu Ser Gly Ile Ala Lys Ala Trp His Gln Ala Tyr Phe Gly
305 310 315

Glu Gly Ser Gly Pro Val Met Leu Asp Glu Val Arg Cys Thr Gly Asn
320 325 330

Glu Leu Ser Ile Glu Gln Cys Pro Lys Ser Ser Trp Gly Glu His Asn
335 340 345

Cys Gly His Lys Glu Asp Ala Gly Val Ser Cys Thr Pro Leu Thr Asp
350 355 360

Gly Val Ile Arg Leu Ala Gly Gly Lys Gly Ser His Glu Gly Arg Leu
365 370 375 380

Glu Val Tyr Tyr Arg Gly Gln Trp Gly Thr Val Cys Asp Asp Gly Trp
385 390 395

Thr Glu Leu Asn Thr Tyr Val Val Cys Arg Gln Leu Gly Phe Lys Tyr
400 405 410

Gly Lys Gln Ala Ser Ala Asn His Phe Glu Glu Ser Thr Gly Pro Ile
415 420 425

Trp Leu Asp Asp Val Ser Cys Ser Gly Lys Glu Thr Arg Phe Leu Gln
430 435 440

Cys Ser Arg Arg Gln Trp Gly Arg His Asp Cys Ser His Arg Glu Asp
445 450 455 460

Val Ser Ile Ala Cys Tyr Pro Gly Gly Glu Gly His Arg Leu Ser Leu
465 470 475

Gly Phe Pro Val Arg Leu Met Asp Gly Glu Asn Lys Lys Glu Gly Arg
480 485 490

Val Glu Val Phe Ile Asn Gly Gln Trp Gly Thr Ile Cys Asp Asp Gly
495 500 505

Trp Thr Asp Lys Asp Ala Ala Val Ile Cys Arg Gln Leu Gly Tyr Lys
510 515 520

Gly Pro Ala Arg Ala Arg Thr Met Ala Tyr Phe Gly Glu Gly Lys Gly
525 530 535 540

Pro Ile His Val Asp Asn Val Lys Cys Thr Gly Asn Glu Arg Ser Leu
545 550 555

Ala Asp Cys Ile Lys Gln Asp Ile Gly Arg His Asn Cys Arg His Ser
560 565 570

Glu Asp Ala Gly Val Ile Cys Asp Tyr Phe Gly Lys Lys Ala Ser Gly
575 580 585

Asn Ser Asn Lys Glu Ser Leu Ser Ser Val Cys Gly Leu Arg Leu Leu
590 595 600

His Arg Arg Gln Lys Arg Ile Ile Gly Gly Lys Asn Ser Leu Arg Gly
605 610 615 620

Gly Trp Pro Trp Gln Val Ser Leu Arg Leu Lys Ser Ser His Gly Asp
625 630 635

Gly Arg Leu Leu Cys Gly Ala Thr Leu Leu Ser Ser Cys Trp Val Leu
640 645 650

Thr Ala Ala His Cys Phe Lys Arg Tyr Gly Asn Ser Thr Arg Ser Tyr
655 660 665

Ala Val Arg Val Gly Asp Tyr His Thr Leu Val Pro Glu Glu Phe Glu
670 675 680

Glu Glu Ile Gly Val Gln Gln Ile Val Ile His Arg Glu Tyr Arg Pro
685 690 695 700

Asp Arg Ser Asp Tyr Asp Ile Ala Leu Val Arg Leu Gln Gly Pro Glu
705 710 715

Glu Gln Cys Ala Arg Phe Ser Ser His Val Leu Pro Ala Cys Leu Pro
720 725 730

Leu Trp Arg Glu Arg Pro Gln Lys Thr Ala Ser Asn Cys Tyr Ile Thr
735 740 745
735 740 745 



Gly Trp Gly Asp Thr Gly Arg Ala Tyr Ser Arg Thr Leu Gln Gln Ala
750 755 760

Ala Ile Pro Leu Leu Pro Lys Arg Phe Cys Glu Glu Arg Tyr Lys Gly
765 770 775 780

Arg Phe Thr Gly Arg Met Leu Cys Ala Gly Asn Leu His Glu His Lys
785 790 795

Arg Val Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Met Cys Glu
800 805 810

Arg Pro Gly Glu Ser Trp Val Val Tyr Gly Val Thr Ser Trp Gly Tyr
815 820 825

Gly Cys Gly Val Lys Asp Ser Pro Gly Val Tyr Thr Lys Val Ser Ala
830 835 840

Phe Val Pro Trp Ile Lys Ser Val Thr Lys Leu
845 850 855


<210> 3
<211> 2356
<212> DNA
<213> Mus musculus

<220>
<221> sig_peptide
<222> (24)..(86)

<220>
<221> mat_peptide
<222> (87)..(2306)

<220>
<221> CDS
<222> (24)..(2306)

<220>
<221> polyA_site
<222> one-of(2341, 2356)

<220>
<221> 5'UTR
<222> (1)..(23)

<220>
<221> 3'UTR
<222> (2307)..one-of(2341, 2356)

<400> 3

ggaccacact cggcgccgca gcc atg gcg ctc gcc cgc tgc gtg ctg gct gtg
Met Ala Leu Ala Arg Cys Val Leu Ala Val
-20 -15

att tta ggg gca ctg tct gta gtg gcc cgc gct gat ccg gtc tcg cgc
Ile Leu Gly Ala Leu Ser Val Val Ala Arg Ala Asp Pro Val Ser Arg
-10 -5 -1 1 5

tct ccc ctt cac cgc ccg cat ccg tcc cca ccg cgt tcc caa cac gcg
Ser Pro Leu His Arg Pro His Pro Ser Pro Pro Arg Ser Gln His Ala
10 15 20

cac tac ctt ccc agc tcg cgg cgg cca ccc agg acc ccg cgc ttc ccg
His Tyr Leu Pro Ser Ser Arg Arg Pro Pro Arg Thr Pro Arg Phe Pro
25 30 35

ctc ccg ctg cgg atc ccc gct gcc cag cgc ccg cag gtc ctc agc acc
Leu Pro Leu Arg Ile Pro Ala Ala Gln Arg Pro Gln Val Leu Ser Thr
40 45 50

ggg cac acg ccc ccg acg att cca cgc cgc tgc ggg gca gga gag tcg
Gly His Thr Pro Pro Thr Ile Pro Arg Arg Cys Gly Ala Gly Glu Ser
55 60 65

tgg ggc aat gcc acc aac ctc ggc gtc ccg tgt cta cac tgg gac gag
Trp Gly Asn Ala Thr Asn Leu Gly Val Pro Cys Leu His Trp Asp Glu
70 75 80 85

gtg ccg ccc ttc ctg gag cgg tcg ccc ccg gcc agt tgg gct gag ctg
Val Pro Pro Phe Leu Glu Arg Ser Pro Pro Ala Ser Trp Ala Glu Leu
90 95 100

cga ggg cag ccg cac aac ttc tgc cgg agc ccg gat ggc tcg ggc aga
Arg Gly Gln Pro His Asn Phe Cys Arg Ser Pro Asp Gly Ser Gly Arg
105 110 115

cct tgg tgc ttc tat cgg aat gcc cag ggc aaa gta gac tgg ggc tac
Pro Trp Cys Phe Tyr Arg Asn Ala Gln Gly Lys Val Asp Trp Gly Tyr
120 125 130

tgc gat tgt ggt caa ggc ccg gcg ttg ccc gtc att cgc ctt gtt ggt
Cys Asp Cys Gly Gln Gly Pro Ala Leu Pro Val Ile Arg Leu Val Gly
135 140 145

ggg aac agt ggg cat gaa ggt cga gtg gag ctg tac cac gct ggc cag
Gly Asn Ser Gly His Glu Gly Arg Val Glu Leu Tyr His Ala Gly Gln
150 155 160 165

tgg ggg acc atc tgt gac gac caa tgg gac aat gca gac gca gac gtc
Trp Gly Thr Ile Cys Asp Asp Gln Trp Asp Asn Ala Asp Ala Asp Val
170 175 180

atc tgt agg cag ctg ggg ctc agt ggc att gcc aaa gca tgg cat cag
Ile Cys Arg Gln Leu Gly Leu Ser Gly Ile Ala Lys Ala Trp His Gln
185 190 195

gca cat ttt ggg gaa gga tct ggc cca ata ttg ttg gat gaa gta cgc
Ala His Phe Gly Glu Gly Ser Gly Pro Ile Leu Leu Asp Glu Val Arg
200 205 210



tgc acc gga aac gag ctg tca att gag caa tgt cca aag agt tcc tgg
Cys Thr Gly Asn Glu Leu Ser Ile Glu Gln Cys Pro Lys Ser Ser Trp
215 220 225

ggc gaa cat aac tgt ggc cat aaa gaa gat gct gga gtg tct tgt gtt
Gly Glu His Asn Cys Gly His Lys Glu Asp Ala Gly Val Ser Cys Val
230 235 240 245

cct cta aca gat ggt gtc atc aga ctg gca gga gga aaa agt acc cat
Pro Leu Thr Asp Gly Val Ile Arg Leu Ala Gly Gly Lys Ser Thr His
250 255 260

gaa ggt cgc ctg gag gtc tac tac aag ggg cag tgg ggg aca gtc tgt
Glu Gly Arg Leu Glu Val Tyr Tyr Lys Gly Gln Trp Gly Thr Val Cys
265 270 275

gat gat ggc tgg act gag atg aac aca tac gtg gct tgt cga ctg ctg
Asp Asp Gly Trp Thr Glu Met Asn Thr Tyr Val Ala Cys Arg Leu Leu
280 285 290

gga ttt aaa tac ggc aaa cag tcc tct gtg aac cat ttt gat ggc agc
Gly Phe Lys Tyr Gly Lys Gln Ser Ser Val Asn His Phe Asp Gly Ser
295 300 305

aac agg ccc ata tgg ctg gat gac gtc agc tgc tca gga aaa gaa gtc
Asn Arg Pro Ile Trp Leu Asp Asp Val Ser Cys Ser Gly Lys Glu Val
310 315 320 325

agc ttc att cag tgt tcc agg aga cag tgg gga agg cat gac tgc agc
Ser Phe Ile Gln Cys Ser Arg Arg Gln Trp Gly Arg His Asp Cys Ser
330 335 340

cat aga gaa gat gtg ggc ctc acc tgc tat cct gac agc gat gga cat
His Arg Glu Asp Val Gly Leu Thr Cys Tyr Pro Asp Ser Asp Gly His
345 350 355

agg ctt tct cca ggt ttt ccc atc aga cta gtg gat gga gag aat aag
Arg Leu Ser Pro Gly Phe Pro Ile Arg Leu Val Asp Gly Glu Asn Lys
360 365 370

aag gaa gga cga gtg gag gtt ttt gtc aat ggc caa tgg gga aca atc
Lys Glu Gly Arg Val Glu Val Phe Val Asn Gly Gln Trp Gly Thr Ile
375 380 385

tgc gat gac gga tgg acc gat aag cat gca gct gtg atc tgc cgg caa
Cys Asp Asp Gly Trp Thr Asp Lys His Ala Ala Val Ile Cys Arg Gln
390 395 400 405

ctt ggc tat aag ggt cct gcc aga gca agg act atg gct tat ttt ggg
Leu Gly Tyr Lys Gly Pro Ala Arg Ala Arg Thr Met Ala Tyr Phe Gly
410 415 420

gaa gga aaa ggc ccc atc cac atg gat aat gtg aag tgc aca gga aat
Glu Gly Lys Gly Pro Ile His Met Asp Asn Val Lys Cys Thr Gly Asn
425 430 435






gag aag gcc ctg gct gac tgt gtc aaa caa gac att gga agg cac aac
Glu Lys Ala Leu Ala Asp Cys Val Lys Gln Asp Ile Gly Arg His Asn
440 445 450

tgc cgc cac agt gag gat gca gga gtc atc tgt gac tat tta gag aag
Cys Arg His Ser Glu Asp Ala Gly Val Ile Cys Asp Tyr Leu Glu Lys
455 460 465

aaa gca tca agt agt ggt aat aaa gag atg ctc tca tct gga tgt gga
Lys Ala Ser Ser Ser Gly Asn Lys Glu Met Leu Ser Ser Gly Cys Gly
470 475 480 485

ctg agg tta ctg cac cgt cgg cag aaa cgg atc att ggt ggg aac aat
Leu Arg Leu Leu His Arg Arg Gln Lys Arg Ile Ile Gly Gly Asn Asn
490 495 500

tct tta agg ggt gcc tgg cct tgg cag gct tcc ctc agg ctg agg tcg
Ser Leu Arg Gly Ala Trp Pro Trp Gln Ala Ser Leu Arg Leu Arg Ser
505 510 515

gcc cat gga gac ggc agg ctg ctt tgt gga gct acc ctt ctg agt agc
Ala His Gly Asp Gly Arg Leu Leu Cys Gly Ala Thr Leu Leu Ser Ser
520 525 530

tgc tgg gtc ctg aca gct gca cac tgc ttc aaa agg tac gga aac aac
Cys Trp Val Leu Thr Ala Ala His Cys Phe Lys Arg Tyr Gly Asn Asn
535 540 545

tcg agg agc tat gca gtt cga gtt ggg gat tat cat act ctg gtc cca
Ser Arg Ser Tyr Ala Val Arg Val Gly Asp Tyr His Thr Leu Val Pro
550 555 560 565

gag gag ttt gaa caa gaa ata ggg gtt caa cag att gtg att cac agg
Glu Glu Phe Glu Gln Glu Ile Gly Val Gln Gln Ile Val Ile His Arg
570 575 580

aac tac agg cca gac aga agc gac tat gac att gcc ctg gtt aga ttg
Asn Tyr Arg Pro Asp Arg Ser Asp Tyr Asp Ile Ala Leu Val Arg Leu
585 590 595

caa gga cca ggg gag caa tgt gcc aga cta agc acc cac gtt ttg cca
Gln Gly Pro Gly Glu Gln Cys Ala Arg Leu Ser Thr His Val Leu Pro
600 605 610

gcc tgt tta cct cta tgg aga gag agg cca cag aaa aca gcc tcc aac
Ala Cys Leu Pro Leu Trp Arg Glu Arg Pro Gln Lys Thr Ala Ser Asn
615 620 625

tgt cac ata aca gga tgg gga gac aca ggt cgt gcc tac tca aga act
Cys His Ile Thr Gly Trp Gly Asp Thr Gly Arg Ala Tyr Ser Arg Thr
630 635 640 645

cta caa caa gct gct gtg cct ctg tta ccc aag agg ttt tgt aaa gag
Leu Gln Gln Ala Ala Val Pro Leu Leu Pro Lys Arg Phe Cys Lys Glu
650 655 660
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agg tac aag gga eta ttt act ggg aga atg etc tgt get ggg aac etc 
Arg Tyr Lys Gly Leu Phe Thr Gly Arg Met Leu Cys Ala Gly Asn Leu 
665 670 675 

caa gaa gac aac cgt gtg gae age tgc cag gga gac agt gga gga cea 
Gin Glu Asp Asn Arg Val Asp Ser Cys Gin Gly Asp Ser Gly Gly Pro 
680 685 690 

etc atg tgt gaa aag ect gat gag tec tgg gtt gtg tat ggg gtg act 
Leu Met Cys Glu Lys Pro Asp Glu Ser Trp Val Val Tyr Gly Val Thr 
695 700 705 

tec tgg ggg tat gga tgt gga gtc aaa gac act cot gga gtt tat acc 
Ser Trp Gly Tyr Gly Cys Gly Val Lys Asp Thr Pro Gly Val Tyr Thr 
710 715 720 725 

aga gtc ccc get ttt gta ect tgg ata aaa agt gtc ace agt ctg 
Arg Val Pro Ala Phe Val Pro Trp lie Lys Ser Val Thr Ser Leu 
730 735 740 

taacttatgg aaagctcaag aaatagtaaa acagtaaeta ttcagtcttc 



<210> 4 
<211> 761 
<212> PRT 

<213> Mus musculus 

<400> 4 

Met Ala Leu Ala Arg Cys Val Leu Ala Val Ile Leu Gly Ala Leu Ser
-20 -15 -10

Val Val Ala Arg Ala Asp Pro Val Ser Arg Ser Pro Leu His Arg Pro
-5 -1 1 5 10

His Pro Ser Pro Pro Arg Ser Gln His Ala His Tyr Leu Pro Ser Ser
15 20 25

Arg Arg Pro Pro Arg Thr Pro Arg Phe Pro Leu Pro Leu Arg Ile Pro
30 35 40

Ala Ala Gln Arg Pro Gln Val Leu Ser Thr Gly His Thr Pro Pro Thr
45 50 55

Ile Pro Arg Arg Cys Gly Ala Gly Glu Ser Trp Gly Asn Ala Thr Asn
60 65 70 75

Leu Gly Val Pro Cys Leu His Trp Asp Glu Val Pro Pro Phe Leu Glu
80 85 90

Arg Ser Pro Pro Ala Ser Trp Ala Glu Leu Arg Gly Gln Pro His Asn
95 100 105

Phe Cys Arg Ser Pro Asp Gly Ser Gly Arg Pro Trp Cys Phe Tyr Arg
110 115 120

Asn Ala Gln Gly Lys Val Asp Trp Gly Tyr Cys Asp Cys Gly Gln Gly
125 130 135

Pro Ala Leu Pro Val Ile Arg Leu Val Gly Gly Asn Ser Gly His Glu
140 145 150 155

Gly Arg Val Glu Leu Tyr His Ala Gly Gln Trp Gly Thr Ile Cys Asp
160 165 170

Asp Gln Trp Asp Asn Ala Asp Ala Asp Val Ile Cys Arg Gln Leu Gly
175 180 185

Leu Ser Gly Ile Ala Lys Ala Trp His Gln Ala His Phe Gly Glu Gly
190 195 200

Ser Gly Pro Ile Leu Leu Asp Glu Val Arg Cys Thr Gly Asn Glu Leu
205 210 215

Ser Ile Glu Gln Cys Pro Lys Ser Ser Trp Gly Glu His Asn Cys Gly
220 225 230 235

His Lys Glu Asp Ala Gly Val Ser Cys Val Pro Leu Thr Asp Gly Val
240 245 250

Ile Arg Leu Ala Gly Gly Lys Ser Thr His Glu Gly Arg Leu Glu Val
255 260 265

Tyr Tyr Lys Gly Gln Trp Gly Thr Val Cys Asp Asp Gly Trp Thr Glu
270 275 280

Met Asn Thr Tyr Val Ala Cys Arg Leu Leu Gly Phe Lys Tyr Gly Lys
285 290 295

Gln Ser Ser Val Asn His Phe Asp Gly Ser Asn Arg Pro Ile Trp Leu
300 305 310 315

Asp Asp Val Ser Cys Ser Gly Lys Glu Val Ser Phe Ile Gln Cys Ser
320 325 330

Arg Arg Gln Trp Gly Arg His Asp Cys Ser His Arg Glu Asp Val Gly
335 340 345

Leu Thr Cys Tyr Pro Asp Ser Asp Gly His Arg Leu Ser Pro Gly Phe
350 355 360

Pro Ile Arg Leu Val Asp Gly Glu Asn Lys Lys Glu Gly Arg Val Glu
365 370 375

Val Phe Val Asn Gly Gln Trp Gly Thr Ile Cys Asp Asp Gly Trp Thr
380 385 390 395

Asp Lys His Ala Ala Val Ile Cys Arg Gln Leu Gly Tyr Lys Gly Pro
400 405 410

Ala Arg Ala Arg Thr Met Ala Tyr Phe Gly Glu Gly Lys Gly Pro Ile
415 420 425

His Met Asp Asn Val Lys Cys Thr Gly Asn Glu Lys Ala Leu Ala Asp
430 435 440

Cys Val Lys Gln Asp Ile Gly Arg His Asn Cys Arg His Ser Glu Asp
445 450 455

Ala Gly Val Ile Cys Asp Tyr Leu Glu Lys Lys Ala Ser Ser Ser Gly
460 465 470 475

Asn Lys Glu Met Leu Ser Ser Gly Cys Gly Leu Arg Leu Leu His Arg
480 485 490

Arg Gln Lys Arg Ile Ile Gly Gly Asn Asn Ser Leu Arg Gly Ala Trp
495 500 505

Pro Trp Gln Ala Ser Leu Arg Leu Arg Ser Ala His Gly Asp Gly Arg
510 515 520

Leu Leu Cys Gly Ala Thr Leu Leu Ser Ser Cys Trp Val Leu Thr Ala
525 530 535

Ala His Cys Phe Lys Arg Tyr Gly Asn Asn Ser Arg Ser Tyr Ala Val
540 545 550 555

Arg Val Gly Asp Tyr His Thr Leu Val Pro Glu Glu Phe Glu Gln Glu
560 565 570

Ile Gly Val Gln Gln Ile Val Ile His Arg Asn Tyr Arg Pro Asp Arg
575 580 585

Ser Asp Tyr Asp Ile Ala Leu Val Arg Leu Gln Gly Pro Gly Glu Gln
590 595 600

Cys Ala Arg Leu Ser Thr His Val Leu Pro Ala Cys Leu Pro Leu Trp
605 610 615

Arg Glu Arg Pro Gln Lys Thr Ala Ser Asn Cys His Ile Thr Gly Trp
620 625 630 635

Gly Asp Thr Gly Arg Ala Tyr Ser Arg Thr Leu Gln Gln Ala Ala Val
640 645 650

Pro Leu Leu Pro Lys Arg Phe Cys Lys Glu Arg Tyr Lys Gly Leu Phe
655 660 665

Thr Gly Arg Met Leu Cys Ala Gly Asn Leu Gln Glu Asp Asn Arg Val
670 675 680

Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Met Cys Glu Lys Pro
685 690 695

Asp Glu Ser Trp Val Val Tyr Gly Val Thr Ser Trp Gly Tyr Gly Cys
700 705 710 715
Gly Val Lys Asp Thr Pro Gly Val Tyr Thr Arg Val Pro Ala Phe Val
720 725 730

Pro Trp Ile Lys Ser Val Thr Ser Leu
735 740



<210> 5
<211> 257
<212> PRT
<213> Homo sapiens
<400> 5

Cys Gly Leu Arg Leu Leu His Arg Arg Gln Lys Arg Ile Ile Gly Gly
1 5 10 15

Lys Asn Ser Leu Arg Gly Gly Trp Pro Trp Gln Val Ser Leu Arg Leu
20 25 30

Lys Ser Ser His Gly Asp Gly Arg Leu Leu Cys Gly Ala Thr Leu Leu
35 40 45

Ser Ser Cys Trp Val Leu Thr Ala Ala His Cys Phe Lys Arg Tyr Gly
50 55 60

Asn Ser Thr Arg Ser Tyr Ala Val Arg Val Gly Asp Tyr His Thr Leu
65 70 75 80

Val Pro Glu Glu Phe Glu Glu Glu Ile Gly Val Gln Gln Ile Val Ile
85 90 95

His Arg Glu Tyr Arg Pro Asp Arg Ser Asp Tyr Asp Ile Ala Leu Val
100 105 110

Arg Leu Gln Gly Pro Glu Glu Gln Cys Ala Arg Phe Ser Ser His Val
115 120 125

Leu Pro Ala Cys Leu Pro Leu Trp Arg Glu Arg Pro Gln Lys Thr Ala
130 135 140

Ser Asn Cys Tyr Ile Thr Gly Trp Gly Asp Thr Gly Arg Ala Tyr Ser
145 150 155 160

Arg Thr Leu Gln Gln Ala Ala Ile Pro Leu Leu Pro Lys Arg Phe Cys
165 170 175

Glu Glu Arg Tyr Lys Gly Arg Phe Thr Gly Arg Met Leu Cys Ala Gly
180 185 190

Asn Leu His Glu His Lys Arg Val Asp Ser Cys Gln Gly Asp Ser Gly
195 200 205

Gly Pro Leu Met Cys Glu Arg Pro Gly Glu Ser Trp Val Val Tyr Gly
210 215 220

Val Thr Ser Trp Gly Tyr Gly Cys Gly Val Lys Asp Ser Pro Gly Val
225 230 235 240

Tyr Thr Lys Val Ser Ala Phe Val Pro Trp Ile Lys Ser Val Thr Lys
245 250 255

Leu



<210> 6 
<211> 257 
<212> PRT 

<213> Mus musculus
<400> 6

Cys Gly Leu Arg Leu Leu His Arg Arg Gln Lys Arg Ile Ile Gly Gly
1 5 10 15

Asn Asn Ser Leu Arg Gly Ala Trp Pro Trp Gln Ala Ser Leu Arg Leu
20 25 30

Arg Ser Ala His Gly Asp Gly Arg Leu Leu Cys Gly Ala Thr Leu Leu
35 40 45

Ser Ser Cys Trp Val Leu Thr Ala Ala His Cys Phe Lys Arg Tyr Gly
50 55 60

Asn Asn Ser Arg Ser Tyr Ala Val Arg Val Gly Asp Tyr His Thr Leu
65 70 75 80

Val Pro Glu Glu Phe Glu Gln Glu Ile Gly Val Gln Gln Ile Val Ile
85 90 95

His Arg Asn Tyr Arg Pro Asp Arg Ser Asp Tyr Asp Ile Ala Leu Val
100 105 110

Arg Leu Gln Gly Pro Gly Glu Gln Cys Ala Arg Leu Ser Thr His Val
115 120 125

Leu Pro Ala Cys Leu Pro Leu Trp Arg Glu Arg Pro Gln Lys Thr Ala
130 135 140

Ser Asn Cys His Ile Thr Gly Trp Gly Asp Thr Gly Arg Ala Tyr Ser
145 150 155 160

Arg Thr Leu Gln Gln Ala Ala Val Pro Leu Leu Pro Lys Arg Phe Cys
165 170 175

Lys Glu Arg Tyr Lys Gly Leu Phe Thr Gly Arg Met Leu Cys Ala Gly
180 185 190

Asn Leu Gln Glu Asp Asn Arg Val Asp Ser Cys Gln Gly Asp Ser Gly
195 200 205

Gly Pro Leu Met Cys Glu Lys Pro Asp Glu Ser Trp Val Val Tyr Gly
210 215 220

Val Thr Ser Trp Gly Tyr Gly Cys Gly Val Lys Asp Thr Pro Gly Val
225 230 235 240

Tyr Thr Arg Val Pro Ala Phe Val Pro Trp Ile Lys Ser Val Thr Ser
245 250 255

Leu



<210> 7
<211> 23
<212> DNA
<213> Mus musculus
<220>
<221> misc_feature
<222> (6)..(18)
<223> Nucleotides 6, 9, 12, 15, and 18 are n wherein n =

<400> 7

tgggtnsynw sngcngcnca ttg



<210> 8
<211> 20
<212> DNA
<213> Mus musculus
<220>
<221> misc_feature
<222> (9)..(18)
<223> Nucleotides 9, 15, and 18 are n wherein n = i.
<400> 8

acrbtyccnc trwsnccncc



<210> 9 
<211> 14 
<212> PRT 

<213> Mus musculus 

<400> 9 

Ser Ser Cys Trp Val Leu Ser Ala Ala His Cys Phe Leu Glu 
1 5 10



<210> 10 
<211> 13 
<212> PRT 

<213> Mus musculus 


<400> 10 

His Asp Ala Cys Gln Gly Asp Ser Gly Gly Pro Leu Val



<210> 11 
<211> 14 
<212> PRT 

<213> Mus musculus 
<400> 11 

Ser Pro Cys Trp Val Ala Ser Ala Ala His Cys Phe Ile Gln



<210> 12 
<211> 13 
<212> PRT 

<213> Mus musculus 
<400> 12 

Thr Asp Ser Cys Lys Gly Asp Ser Gly Gly Pro Leu Ile



<210> 13 
<211> 14 
<212> PRT 

<213> Mus musculus 
<400> 13 

Ser Asp Arg Trp Val Leu Thr Ala Ala His Cys Ile Leu Tyr



<210> 14 
<211> 13 
<212> PRT 

<213> Mus musculus 
<400> 14 

Gly Asp Ala Cys Glu Gly Asp Ser Gly Gly Pro Phe Val 



<210> 15 
<211> 14 
<212> PRT 

<213> Mus musculus 
<400> 15 

Ala Pro Glu Trp Val Leu Thr Ala Ala His Cys Leu Lys Ser



<210> 16
<211> 13
<212> PRT
<213> Mus musculus
<400> 16

Val Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Leu Val



<210> 17 
<211> 14 
<212> PRT 

<213> Mus musculus 
<400> 17 

Asn Asp Gln Trp Val Val Ser Ala Ala His Cys Tyr Lys Tyr



<210> 18 
<211> 13 
<212> PRT 

<213> Mus musculus 

<400> 18 

Lys Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro Val Val



<210> 19 
<211> 14 
<212> PRT 

<213> Mus musculus 
<400> 19 

Ser Glu Asp Trp Val Val Thr Ala Ala His Cys Gly Val Lys 



<210> 20 
<211> 13 
<212> PRT 

<213> Mus musculus 
<400> 20 

Val Ser Ser Cys Met Gly Asp Ser Gly Gly Pro Leu Val



<210> 21
<211> 14
<212> PRT
<213> Mus musculus
<400> 21

Ala Asn Asn Trp Val Leu Thr Ala Ala His Cys Leu Ser Asn



<210> 22
<211> 13
<212> PRT
<213> Mus musculus
<400> 22

Thr Ser Ser Cys Asn Gly Asp Ser Gly Gly Pro Leu Asn



<210> 23 
<211> 32 
<212> DNA 

<213> EcoRI and BamHI 
<220>
<221> misc_feature
<222> (15)..(27)
<223> Nucleotides 15, 18, 21, 24, and 27 are n wherein n

<220>
<221> misc_feature
<222> (16)
<223> Nucleotide 16 is n wherein n = c/g.

<220>
<221> misc_feature
<222> (17)
<223> Nucleotide 17 is n wherein n = t/c.

<220>
<221> misc_feature
<222> (19)
<223> Nucleotide 19 is n wherein n = t/a.

<220>
<221> misc_feature
<222> (20)
<223> Nucleotide 20 is n wherein n = g/c.

<220>
<221> misc_feature
<222> (30)
<223> Nucleotide 30 is n wherein n = t/c.

<400> 23

ggggaattct gggtnnnnnn ngcngcncan tg



<210> 24
<211> 29
<212> DNA
<213> EcoRI and BamHI
<220>
<221> misc_feature
<222> (12)..(21)
<223> Nucleotides 12, 15, and 21 are n wherein n =

<220>
<221> misc_feature
<222> (16)
<223> Nucleotide 16 is n wherein n = g/c.

<220>
<221> misc_feature
<222> (17)
<223> Nucleotide 17 is n wherein n = a/t.

<220>
<221> misc_feature
<222> (18)
<223> Nucleotide 18 is n wherein n = a/g.

<220>
<221> misc_feature
<222> (24)
<223> Nucleotide 24 is n wherein n = c/t.

<220>
<221> misc_feature
<222> (26)
<223> Nucleotide 26 is n wherein n = g/c/t.

<220>
<221> misc_feature
<222> (27)
<223> Nucleotide 27 is n wherein n = g/a.

<400> 24

gggggatccc cnccnnnntc nccntnnca



<210> 25 
<211> 33 
<212> DNA 

<213> HindIII and XhoI
<220>
<221> misc_feature
<222> (12)..(27)
<223> Nucleotides 12, 21, 24, and 27 are n wherein

<220>
<221> misc_feature
<222> (15)
<223> Nucleotide 15 is n wherein n = a/g.

<220>
<221> misc_feature
<222> (25)
<223> Nucleotide 25 is n wherein n = a/g.

<220>
<221> misc_feature
<222> (30)
<223> Nucleotide 30 is n wherein n = c/t.

<220>
<221> misc_feature
<222> (33)
<223> Nucleotide 33 is n wherein n = c/t.

<400> 25

gggaagcttg gncantgggg nacnntntgn gan 33



<210> 26
<211> 33
<212> DNA
<213> HindIII and XhoI
<220>
<221> misc_feature
<222> (15)..(28)
<223> Nucleotides 15 and 28 are n wherein n = i.
<400> 26

gggctcgagc cccancctgt tatgtaanag ttg 33



<210> 27
<211> 17
<212> PRT
<213> Mus musculus
<400> 27

Ser Arg Ser Pro Leu His Arg Pro His Pro Ser Pro Pro Arg Ser Gln



<210> 28
<211> 13
<212> PRT
<213> Mus musculus
<400> 28

Leu Pro Ser Ser Arg Arg Pro Pro Arg Thr Pro Arg Phe
1 5 10
Patent

Attorney's Docket No. 030708-035 



IN THE UNITED STATES PATENT AND TRADEMARK OFFICE 

In re Patent Application of 



Peter SONDEREGGER 

Application No.:

Filed: October 26, 1999 

For: NEUROTRYPSIN 

PRELIMINARY AMENDMENT 

Assistant Commissioner for Patents 
Washington, D.C. 20231 



Group Art Unit: Unassigned 
Examiner: Unassigned 



Prior to examination on the merits, please amend the 
subject application as follows: 



IN THE CLAIMS:

Please cancel claims 1-46 without prejudice or
disclaimer.

Please add the following new claims 47-61:
--47. Neurotrypsins of the formulas I and II
I: neurotrypsin of the human
II: neurotrypsin of the mouse

48. Neurotrypsin according to claim 47, characterized in
that the compounds of the formulas I and II comprise the
separate, coding nucleotide sequences and the coded amino acid
sequences of the compounds of the formulas I or II.

49. Use of the coding nucleotide sequences of the
compounds of the formulas I or II for the production of
recombinant proteins.

50. Use of proteins with the coded amino acid sequences
of the compounds of the formulas I or II as targets for the
development of pharmaceutical drugs, for example for the
inhibition or the enhancement of the catalytic activity of the
coded proteins of the formulas I or II.

51. Use of the species-homologous proteins of the
compounds of the formulas I or II as targets for the
development of pharmaceutical drugs, for example for the
inhibition or the enhancement of the catalytic activity of the
coded proteins of the formulas I or II.

52. Use of the proteins with the coded amino acid
sequences of the compounds of the formulas I or II for the
spatial structure determination, for example the spatial
structure determination by means of crystallography or nuclear
resonance spectroscopy.

53. Use of the coded amino acid sequences of the
compounds of the formulas I or II for the prediction of the
protein structure by means of computerized protein structure
prediction methods.

54. Use of the spatial structure of the coded amino acid
sequences of the compounds of the formulas I or II as targets
for the development of pharmaceutical drugs, for example for
the inhibition or the enhancement of the catalytic activity of
the coded proteins of the compounds of the formulas I or II.

55. Use of the coding nucleotide sequences of the
compounds of the formulas I or II in gene therapeutical
applications in humans and in animals, as for example as parts
of gene therapy vectors as for example as parts of artificial
chromosomes.

56. Use of the compounds of the formulas I or II for so-called
cell engineering applications for the production of
gene technologically mutated cells, which produce the coded
sequences.

57. Use of the coded amino acid sequences of the
compounds of the formulas I or II as antigens for the
production of antibodies, as for example antibodies that
inhibit or promote the protease function or antibodies that
can be used for immunohistochemical studies.

58. Use of the coding nucleotide sequences of the
compounds of the formulas I or II for the production of
transgenic animals, as for example transgenic mice.

59. Use of the coding nucleotide sequences of the
compounds of the formulas I or II for the inactivation or the
mutation of the corresponding gene by means of gene targeting
techniques, as for example the elimination of the gene in the
mouse through homologous recombination.

60. Use of the compounds of the formulas I or II for the
diagnostics of disorders in the gene corresponding to the
compound of the formula I.

61. Use of the coding nucleotide sequences of the
compounds of the formulas I or II as a starting sequence for
gene technological modifications aimed at the production of
pharmaceutical compositions or gene therapy vectors which
exhibit changed properties as compared with the corresponding
pharmaceutical compositions or gene therapy vectors containing
the coding nucleotide sequence of the compounds of formulas I
or II, for example changed proteolytic activity, changed
proteolytic specificity, or changed pharmacokinetic
characteristics.--

REMARKS

Support for the new claims can be found, at least, in
original claims 1-46.

Early and favorable consideration of the subject
application is earnestly solicited.

Respectfully submitted,

Burns, Doane, Swecker & Mathis, L.L.P.

P.O. Box 1404
Alexandria, Virginia 22313-1404
(703) 836-6620

Date: October 26, 1999
Neurotrypsin

Technical Field

The present invention is directed to neurotrypsins and to a pharmaceutical
composition which contains these substances or has an influence on these substances.

Disclosure of Invention

Neurotrypsin is a newly discovered serine protease, which is predominantly
expressed in the brain and in the lungs; the expression in the brain takes place nearly
exclusively in the neurons.

Neurotrypsin has a domain composition that has not been found previously: besides the
protease domain, there are 3 or 4 SRCR (scavenger receptor cysteine-rich)
domains and one Kringle domain. It is to be pointed out that the combination of Kringle
and SRCR domains has not yet been found in proteins. At the amino terminus of the
neurotrypsin protein there is a segment of more than 60 amino acids, which has an
extremely high proportion of proline and basic amino acids (arginine and histidine).

The invention is characterized by the features in the independent claims.
Preferred embodiments are defined in the dependent claims.

The newly found neurotrypsins

- neurotrypsin of the human (compound of the formula I),

- neurotrypsin of the mouse (compound of the formula II)

differ structurally very much from the serine proteases known so far.

The serine protease whose protease domain is structurally most closely related
to the protease domain of the new compounds, namely plasmin (of the human), has
only a 44 % amino acid sequence identity.

The proline-rich, basic segment at the amino terminus has a certain resemblance
to the basic segments of the netrins and the semaphorins/collapsins. Due to this
segment, it is probable that neurotrypsin can be enriched by means of heparin-affinity
chromatography.

The neurotrypsins of the human (compound of the formula I) and of the mouse
(compound of the formula II) exhibit a very high structural similarity to each other.

The identity of the amino acid sequences of the native proteins of the compounds
of the formulas I and II amounts to 81 %.

The neurotrypsin of the human (compound of the formula I) has a coding
sequence of 2625 nucleotides. The coded peptide of the compound of the formula I has
a length of 875 amino acids and contains a signal peptide of 20 amino acids. The
neurotrypsin of the mouse (compound of the formula II) has a coding sequence of 2283
nucleotides. The coded protein of the compound of the formula II has a length of 761
amino acids and contains a signal peptide of 21 amino acids. The reason for the greater
length of the neurotrypsin of the human is that the human neurotrypsin has
4 SRCR domains, whereas the neurotrypsin of the mouse has only 3 SRCR domains.
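
The relation between the quoted nucleotide and amino acid counts can be checked with a short calculation. The following is a minimal sketch, not part of the original disclosure; it assumes that the quoted coding lengths do not include the stop codon (which is consistent with 2625 = 3 x 875 and 2283 = 3 x 761), and the function names are hypothetical.

```python
# Minimal sketch: consistency check of the lengths quoted in the description.
# Assumption: the coding-sequence lengths are given without the stop codon.
CODON_LENGTH = 3  # nucleotides per codon


def residues_encoded(coding_nucleotides: int) -> int:
    """Return the number of amino acids encoded by a coding sequence."""
    if coding_nucleotides % CODON_LENGTH != 0:
        raise ValueError("coding sequence length is not a multiple of 3")
    return coding_nucleotides // CODON_LENGTH


def mature_length(precursor_residues: int, signal_peptide_residues: int) -> int:
    """Return the protein length after removal of the signal peptide."""
    return precursor_residues - signal_peptide_residues


human = residues_encoded(2625)  # compound of the formula I
mouse = residues_encoded(2283)  # compound of the formula II
print(human, mature_length(human, 20))  # 875 855
print(mouse, mature_length(mouse, 21))  # 761 740
```

The value of 740 residues obtained for the mouse protein without its signal peptide agrees with the numbering of the mouse amino acid sequence in the sequence listing, which ends at position 740.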

The domains which are present in both compounds (compound of the formula I
and compound of the formula II) have a high degree of sequence similarity. The
corresponding SRCR domains of the compounds of the formulas I and II have an amino
acid sequence identity from 81% to 91%. The corresponding Kringle domains have an
amino acid sequence identity of 75%. A high degree of similarity also exists in the
enzymatically active (i.e. proteolytic) domain (90% amino acid sequence identity).

The protease domains of the neurotrypsins of the human (compound of the
formula I) and of the mouse (compound of the formula II) are aligned in the following
section, in order to illustrate the high degree of sequence identity.

CGLRLLHRRQKRIIGGKNSLRGGWPWQVSLRLKSSHGDGRLLCGATLLSS 50
|||||||||||||||| ||||| |||| |||| | |||||||||||||||
CGLRLLHRRQKRIIGGNNSLRGAWPWQASLRLRSAHGDGRLLCGATLLSS

CWVLTAAHCFKRYGNSTRSYAVRVGDYHTLVPEEFEEEIGVQQIVIHREY 100
|||||||||||||||  ||||||||||||||||||| ||||||||||| |
CWVLTAAHCFKRYGNNSRSYAVRVGDYHTLVPEEFEQEIGVQQIVIHRNY

RPDRSDYDIALVRLQGPEEQCARFSSHVLPACLPLWRERPQKTASNCYIT 150
||||||||||||||||| ||||| | ||||||||||||||||||||| ||
RPDRSDYDIALVRLQGPGEQCARLSTHVLPACLPLWRERPQKTASNCHIT

GWGDTGRAYSRTLQQAAIPLLPKRFCEERYKGRFTGRMLCAGNLHEHKRV 200
||||||||||||||||| |||||||| ||||| ||||||||||| |  ||
GWGDTGRAYSRTLQQAAVPLLPKRFCKERYKGLFTGRMLCAGNLQEDNRV

DSCQGDSGGPLMCERPGESWVVYGVTSWGYGCGVKDSPGVYTKVSAFVPW 250
|||||||||||||| | ||||||||||||||||||| ||||| | |||||
DSCQGDSGGPLMCEKPDESWVVYGVTSWGYGCGVKDTPGVYTRVPAFVPW

IKSVTKL 257
||||| |
IKSVTSL



Of the 258 amino acid sequence positions included in the comparison,
233 amino acids are identical in both compounds (upper sequence: compound of
the formula I; lower sequence: compound of the formula II; identical amino acids are
indicated by vertical lines).
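
The way such an identity value is obtained from an ungapped alignment can be illustrated as follows. This is a minimal sketch and not the procedure used by the applicant; the function names are hypothetical, and only the first 50 positions of the two protease domains (as given in the sequence listing) are used as example input.

```python
# Minimal sketch: percent identity and match line for two aligned sequences
# of equal length (no gaps). Function names are illustrative assumptions.

def percent_identity(seq_a: str, seq_b: str) -> float:
    """Return the percentage of positions carrying the same residue."""
    if len(seq_a) != len(seq_b) or not seq_a:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


def match_line(seq_a: str, seq_b: str) -> str:
    """Return a line with '|' under identical positions and ' ' elsewhere."""
    return "".join("|" if a == b else " " for a, b in zip(seq_a, seq_b))


# Positions 1-50 of the protease domains (human above, mouse below).
HUMAN_1_50 = "CGLRLLHRRQKRIIGGKNSLRGGWPWQVSLRLKSSHGDGRLLCGATLLSS"
MOUSE_1_50 = "CGLRLLHRRQKRIIGGNNSLRGAWPWQASLRLRSAHGDGRLLCGATLLSS"

print(HUMAN_1_50)
print(match_line(HUMAN_1_50, MOUSE_1_50))
print(MOUSE_1_50)
print(percent_identity(HUMAN_1_50, MOUSE_1_50))  # 90.0

# The figures quoted in the text correspond to the same kind of ratio:
print(round(100.0 * 233 / 258, 1))  # 90.3
```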

The inventive neurotrypsins are unique when compared with the known serine
proteases in that, according to currently available observations, they are expressed to a
distinct degree in neurons. A further organ with a strong expression of neurotrypsin is
the lungs (see Gschwend et al., Mol. Cell. Neurosci. 9, pages 207-219, 1997).
The proteins that are structurally most similar to the compounds of the formulas I
or II are serine proteases, such as tissue-type plasminogen activator (tPA), urokinase-type
plasminogen activator (uPA), plasmin, trypsin, apolipoprotein (a), coagulation factor
XI, neuropsin, and acrosin.

In the adult brain, the inventive compounds are expressed predominantly in the
cerebral cortex, the hippocampus, and the amygdala.

In the adult brain stem and the spinal cord, the inventive compounds are
expressed predominantly in the motor neurons. A slightly weaker expression is found in
the neurons of the superficial layers of the dorsal horn of the spinal cord.

In the adult peripheral nervous system, the inventive compounds are expressed in
a subpopulation of the sensory ganglia neurons.

The inventive compounds were found in connection with a study aimed at
discovering trypsin-like serine proteases in the nervous system.

The first compound that was found and characterized was the compound of the
formula II (Gschwend et al., Mol. Cell. Neurosci. 9, pages 207-219, 1997).

By means of an alignment of the protease domains of 7 known serine proteases
(tissue-type plasminogen activator, urokinase-type plasminogen activator, thrombin,
plasmin, trypsin, chymotrypsin, and pancreatic elastase) in the proximity of the histidine
and the serine of the catalytic triad of the active site, the sequences of the
primer oligonucleotides for the polymerase chain reaction were determined.

The primer oligonucleotides were used in a polymerase chain reaction (PCR)
together with ss-cDNA from total RNA of the brains of 10-day-old mice and resulted in
the amplification of a cDNA fragment of a length of approximately 500 base pairs.

This cDNA fragment was used successfully for the isolation of further cDNA
fragments by screening commercially available cDNA libraries. Together, the isolated
cDNA fragments covered the full length of the coding part of the compound of the
formula II.
By conventional DNA sequencing the complete nucleotide sequence and the
amino acid sequence deduced therefrom were obtained.
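
The deduction of an amino acid sequence from a nucleotide sequence can be sketched as follows. This is a minimal illustration using the standard genetic code, not the software used by the applicant; the names `CODON_TABLE` and `translate` are hypothetical, and the example input is a short stretch of the mouse coding sequence from the sequence listing.

```python
# Minimal sketch: translation of a coding sequence with the standard genetic code.
# The table is built from the usual compact TCAG ordering of the 64 codons.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO_ACIDS[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def translate(dna: str) -> str:
    """Translate DNA (whitespace ignored) into one-letter amino acid codes.

    Translation stops at the first stop codon; an incomplete trailing codon
    is ignored. Ambiguous bases such as 'n' raise a KeyError.
    """
    dna = "".join(dna.split()).upper()
    protein = []
    for pos in range(0, len(dna) - len(dna) % 3, 3):
        amino_acid = CODON_TABLE[dna[pos:pos + 3]]
        if amino_acid == "*":
            break
        protein.append(amino_acid)
    return "".join(protein)


# Glu Lys Ala Leu Ala Asp Cys Val Lys Gln Asp Ile Gly Arg His Asn
print(translate("gag aag gcc ctg gct gac tgt gtc aaa caa gac att gga agg cac aac"))
# EKALADCVKQDIGRHN
```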

The compound of the formula I was cloned based on its pronounced similarity to
the compound of the formula II.

The primer oligonucleotides used were synthesized according to the known
sequence of the compound of the formula II.

The cloning of the compound of the formula I was performed by means of two
commercially available cDNA libraries from fetal human brain.

This procedure for the cloning can also be used for the isolation of the homologous
compounds of other species, such as rat, rabbit, guinea pig, cow, sheep, pig, primates,
birds, zebra fish (Brachydanio rerio), Drosophila melanogaster, Caenorhabditis elegans,
etc.

The coding nucleotide sequences can be used for the production of proteins with
the coded amino acid sequences of the compounds of the formulas I or II. A procedure
developed in our laboratory allows the production of recombinant proteins in myeloma
cells as fusion proteins with an immunoglobulin domain (constant domain of the kappa
light chain). The principle of the construction is given in detail by Rader et al. (Rader et
al., Eur. J. Biochem. 215, pages 133-141, 1993). The fusion protein produced by the
myeloma cells was isolated by immunoaffinity chromatography using a monoclonal
antibody against the Ig domain of the kappa light chain. With the same expression
method, the native protein of a compound can also be produced, starting from the
coding sequence.

The coding sequences of the compounds of the formulas I or II can be used as
starting compounds for the discovery and the isolation of alleles of the compounds of the
formulas I or II. Both the polymerase chain reaction and the nucleic acid hybridization
can be used for this purpose.



WO 98/49322    PCT/IB98/00625



The coding sequences of the compounds of the formulas I or II can be used as
starting compounds for so-called "site-directed mutagenesis", in order to generate
nucleotide sequences that code for the proteins defined by the compounds of
the formulas I or II, or parts thereof, but whose nucleotide sequence is degenerate with
respect to the compounds of the formulas I or II due to the use of alternative codons.

The coding sequences of the compounds of the formulas I or II can be used as
starting compounds for the production of sequence variants by means of so-called
site-directed mutagenesis.
Best Modes for Carrying out the Invention (Examples)

cDNA cloning of the compound of the formula II (neurotrypsin of the mouse)

Total RNA was isolated from the brains of 10-day-old mice (ICR-ZUR) according
to the method of Chomczynski and Sacchi (1987). The production of single-stranded
cDNA was carried out using oligo(dT) primer and an RNA-dependent DNA polymerase
(SuperScript RNase H- Reverse Transcriptase; Gibco BRL, Gaithersburg, MD) according
to the instructions of the supplier. For the polymerase chain reaction, one
forward primer was synthesized based on the amino acid sequence of the region of the
conserved histidine of the catalytic triad, and one primer in the backward direction was
synthesized based on the amino acid sequence of the region of the conserved serine of
the catalytic triad of the serine proteases. The amino acid sequences used for the
determination of the oligonucleotide primers were taken from seven known serine
proteases. They are presented in the following.



...SSC WVLSAAHC FLE ... HDA CQGDSGG ...
...SPC WVASAAHC FIQ ... TDS CKGDSGG ...
...SDR WVLTAAHC ILY ... GDA CEGDSGG ...
...APE WVLTAAHC LKS ... VDS CQGDSGG ...
...NDQ WVVSAAHC YKY ... KDS CQGDSGG ...
...SED WVVTAAHC GVK ... VSS CMGDSGG ...
...ANN WVLTAAHC LSN ... TSS CNGDSGG ...

5'-TGG GTI SYI WSI GCI GCI CAT TG-3'



The protease domains of 7 known serine proteases (tissue-type plasminogen
activator, urokinase-type plasminogen activator, thrombin, plasmin, trypsin,
chymotrypsin, and pancreatic elastase) were aligned in the region of the conserved
histidine and serine of the catalytic triad of the active site. The conserved amino acids
of these regions were taken as the basis for the determination of the degenerate
primers. The primer sequences are given according to the recommendation of the IUB
nomenclature (Nomenclature Committee 1985).

The primers used in the PCR contained restriction sites for EcoRI and BamHI at
their 5' ends in order to facilitate a subsequent cloning.
The following primers were used:

In the reading direction (sense primers):

5'-GGGGAATTCTGGGTI(C/G)(T/C)I(T/A)(G/C)IGCIGCICA(T/C)TG-3'

In the counter direction (antisense primers):

5'-GGGGGATCCCCICCI(G/C)(A/T)(A/G)TCICC(C/T)T(G/C/T)(G/A)CA-3'.
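
The bracket notation of the two primers can be converted into the one-letter IUB ambiguity codes, and the number of distinct oligonucleotides contained in each primer mixture can be counted. The following is a minimal sketch under stated assumptions: the function `parse_primer` is hypothetical, and inosine (I) is treated as a single, non-degenerate position.

```python
# Minimal sketch: bracketed degenerate primer -> IUB one-letter string and
# degeneracy (number of distinct sequences in the mixture).
# Assumption: inosine (I) counts as one non-degenerate position.
import re

IUB_CODES = {
    frozenset("AG"): "R", frozenset("CT"): "Y", frozenset("CG"): "S",
    frozenset("AT"): "W", frozenset("GT"): "K", frozenset("AC"): "M",
    frozenset("CGT"): "B", frozenset("AGT"): "D", frozenset("ACT"): "H",
    frozenset("ACG"): "V", frozenset("ACGT"): "N",
}

# Either a bracketed choice such as (C/G) or a single base (inosine included).
TOKEN = re.compile(r"\(([ACGT](?:/[ACGT])+)\)|([ACGTI])")


def parse_primer(bracketed: str):
    """Return (IUB one-letter sequence, degeneracy) for a bracketed primer."""
    codes = []
    degeneracy = 1
    pos = 0
    while pos < len(bracketed):
        match = TOKEN.match(bracketed, pos)
        if match is None:
            raise ValueError(f"unexpected character at position {pos}")
        if match.group(1):
            bases = frozenset(match.group(1).split("/"))
            codes.append(IUB_CODES[bases])
            degeneracy *= len(bases)
        else:
            codes.append(match.group(2))
        pos = match.end()
    return "".join(codes), degeneracy


SENSE = "GGGGAATTCTGGGTI(C/G)(T/C)I(T/A)(G/C)IGCIGCICA(T/C)TG"
ANTISENSE = "GGGGGATCCCCICCI(G/C)(A/T)(A/G)TCICC(C/T)T(G/C/T)(G/A)CA"

print(parse_primer(SENSE))      # ('GGGGAATTCTGGGTISYIWSIGCIGCICAYTG', 32)
print(parse_primer(ANTISENSE))  # ('GGGGGATCCCCICCISWRTCICCYTBRCA', 96)
```

The lengths of the two resulting strings (32 and 29 nucleotides) agree with the lengths stated for the corresponding primer entries in the sequence listing.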

The polymerase chain reaction was carried out under standard conditions using
the DNA polymerase AmpliTaq (Perkin Elmer) according to the recommendations of the
producer. The following PCR profile was employed: 93°C for 3 minutes, followed by 35
cycles of 93°C for 1 minute, 48°C for 2 minutes, and 72°C for 2 minutes. Following the
last cycle, the incubation was continued at 72°C for a further 10 minutes.
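
The PCR profile can be written down as a small data structure, for example in order to estimate the programmed run time. This is a minimal sketch; the step names (denaturation, annealing, extension) are the customary interpretation of the three temperatures and are an assumption, and the time needed for heating and cooling between the steps is not taken into account.

```python
# Minimal sketch: the PCR profile as data, with the programmed hold time.
# Step names are assumed; ramp times between temperatures are ignored.
from dataclasses import dataclass


@dataclass(frozen=True)
class Step:
    name: str
    temp_c: int
    minutes: int


INITIAL = Step("initial denaturation", 93, 3)
CYCLE = (
    Step("denaturation", 93, 1),
    Step("annealing", 48, 2),
    Step("extension", 72, 2),
)
CYCLES = 35
FINAL = Step("final extension", 72, 10)


def programmed_minutes() -> int:
    """Return the sum of all programmed hold times in minutes."""
    per_cycle = sum(step.minutes for step in CYCLE)
    return INITIAL.minutes + CYCLES * per_cycle + FINAL.minutes


print(programmed_minutes())  # 188
```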

The amplified fragments had an approximate length of 500 base pairs. They were
cut with EcoRI and BamHI and inserted in a Bluescript vector (Bluescript SK(-),
Stratagene). The resulting clones were analyzed by DNA sequence determination using
the dideoxy chain termination method (Sanger et al., Proc. Natl. Acad. Sci. USA 77,
pages 2163-2167, 1977) on an automated DNA sequencer (LI-COR, model 4000L;
Lincoln, NE) using a commercial sequencing kit (SequiTerm long-read cycle sequencing
kit-LC; Epicentre Technologies, Madison, WI). The analysis yielded a sequence of 474
base pairs of the catalytic region of the serine protease domain of the compound of the
formula II.

The 474 base pair long PCR fragment was used for screening of an oligo(dT)-primed
Uni-ZAP-XR cDNA library from the brain of 20-day-old mice (Stratagene; cat.
no. 937 319). A total of 3 x 10* lambda plaques were screened under high-stringency
conditions (Sambrook et al., Molecular Cloning: A Laboratory Manual, Cold Spring
Harbor Laboratory Press, 1989) using a radioactively labeled PCR fragment as a probe,
and 24 positive clones were found.

From the positive Lambda-Uni-ZAP-XR phagemid clones the corresponding
Bluescript plasmid was cut out by in vivo excision according to a standard method
recommended by the producer (Stratagene). In order to determine the length of the
inserted fragments the corresponding Bluescript plasmid clones were digested with SacI
and KpnI. The clones containing the longest fragments were analyzed by DNA
sequencing (as described above) and for subsequent data analysis the GCG software
(version 8.1, Unix; Silicon Graphics, Inc.) was used.

Because none of the clones contained the coding sequence in full length, a second
cDNA library was screened. The library used in this screen was an oligo(dT)- and
random-primed cDNA library in a Lambda phage (Lambda gt10) which was based on
mRNA from 15-day-old mouse embryos (oligo(dT)- and random-primed Lambda gt10
cDNA library; Clontech, Palo Alto, CA; cat. no. ML 3002a). As a probe a radioactively
labeled DNA fragment (AvaI/AatII) from the 5' end of the longest clone of the first screen
was used and approximately 2x10^ plaques were screened. This screen resulted in 14
positive clones. The cDNA fragments were excised with EcoRI and cloned into the
Bluescript vector (KS(+); Stratagene). The sequence analysis was carried out as
described above.

In this way the nucleotide sequence over the full length cDNA of 2361 and 2376 
base pairs, respectively, of the compound of the formula II was obtained. With the 
described procedure of PCR cloning it is also possible to find and isolate variant forms of 
the compounds of the formulas I or II, as for example their alleles or their splice variants. 
The described method of screening of a cDNA library also allows the detection and the 
isolation of compounds which hybridize under stringent conditions with the coding 
sequences of the compounds of the formulas I or II. 

Cloning of the cDNA of the compound of the formula I (neurotrypsin of the human) 

The cloning of the cDNA of the compound of the formula I was carried out based 
on the nucleotide sequence of the compound of the formula II. As a first step, a fragment 
of the compound of the formula I was amplified using the polymerase chain reaction 
(PCR). As a template we used the DNA obtained from a commercially available cDNA 
library from the brain of a human fetus (17th - 18th week of pregnancy) (oligo(dT)- 
and random-primed human fetal brain cDNA library in the Lambda ZAP II vector, cat. 
no. 936206, Stratagene). The synthetic PCR primers contained restriction sites for 
HindIII and XhoI at the 5' end in order to facilitate the subsequent cloning. 



In the reading direction (sense primers): 

5'-GGGAAGCTTGGICA(A/G)TGGGGIAGI(A/G)TITG(G/T)GA(G/T)-3' 

In the counter direction (antisense primers): 

5'-GGGCTCGAGCCCGAICCTGTTATGTAAIAGTTG-3' 
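
The sense primer is degenerate: each parenthesised group such as (A/G) stands for a mixture of two bases at that position. The following minimal sketch enumerates the concrete sequences such a notation encodes. The primer string is a transcription of the listing above in which "I" is assumed to denote inosine and is left unexpanded, and in which the 5' restriction site is read as the HindIII site named in the text; the function name is hypothetical.

```python
import re
from itertools import product

# Sense primer as read from the text (assumption: "I" = inosine, kept as-is).
SENSE_PRIMER = "GGGAAGCTTGGICA(A/G)TGGGGIAGI(A/G)TITG(G/T)GA(G/T)"


def expand_degenerate(primer):
    """Return every concrete sequence encoded by (X/Y) groups in the primer."""
    # Split into fixed stretches and "(X/Y)" groups, keeping the groups.
    parts = re.split(r"(\([A-Z/]+\))", primer)
    choices = []
    for part in parts:
        if part.startswith("("):
            choices.append(part[1:-1].split("/"))  # alternatives at this position
        elif part:
            choices.append([part])                 # fixed stretch
    return ["".join(combo) for combo in product(*choices)]


variants = expand_degenerate(SENSE_PRIMER)
print(len(variants))  # four two-fold positions -> 2**4 = 16
```

With four two-fold degenerate positions the primer pool contains 16 distinct sequences, each 33 bases long, which is why the text speaks of sense "primers" in the plural.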



The PCR was carried out under standard conditions using the DNA polymerase 
AmpliTaq (Perkin Elmer) according to the recommendations of the producer. The 
resulting fragment of 1116 base pairs was inserted into the Bluescript vector (Bluescript 
SK(-), Stratagene). A 600 base pair long HindIII/StuI fragment, corresponding to the 5' 
half of the 1116 base pair long PCR fragment, was used for the screening of a Lambda 
cDNA library from human fetal brain (Human Fetal Brain 5'-STRETCH PLUS cDNA 
library; Lambda gt10; cat. no. HL 3003a; Clontech). 2x10^ Lambda plaques were 
screened under high-stringency conditions (Sambrook et al., Molecular Cloning: A 
Laboratory Manual, Cold Spring Harbor Laboratory Press, 1989) by means of a 
radioactively labeled PCR fragment, and 23 positive clones were found and isolated. 



From the positive Lambda gt10 clones the corresponding cDNA fragments were 
excised with EcoRI and inserted into a Bluescript vector (Bluescript KS(+), Stratagene). 
The sequencing was carried out by means of the dideoxy chain termination method 
(Sanger et al., Proc. Natl. Acad. Sci. USA 77, pages 2163-2167, 1977), using a 
commercial sequencing kit (SequiTherm long-read cycle sequencing kit-LC; Epicentre 
Technologies, Madison, WI) and Bluescript-specific primers. 

In an alternative sequencing strategy, the cDNA fragments of the positive Lambda 
gt10 clones were PCR amplified using Lambda-specific primers. The sequencing was 
carried out as described above. 



The computerized analysis of the sequences was performed by means of the 
program package GCG (version 8.1, Unix; Silicon Graphics Inc.). 

In this way the nucleotide sequence over the full length of the cDNA of 3350 base 
pairs was obtained. With the described procedure for PCR cloning it is also possible to 
find and to isolate variant forms of the compounds of the formulas I or II, as for example 
their alleles or their splice variants. The described procedure for the screening of a 
cDNA library also allows the discovery and the isolation of compounds which hybridize 
under stringent conditions with the coding sequences of the compounds of the formulas I 
or II. 

Visualization of the coded sequences of the compounds of the formulas I or II by 
means of antibodies 

The proline-rich, basic segment of more than 60 amino acids at the amino 
terminus of the coded sequence of the compounds of the formulas I or II is well suited 
for the production of antibodies by means of synthesizing peptides and using them for 
immunization. We have selected two peptide sequences with a length of 19 and 13 
amino acids from the proline-rich, basic segment at the amino terminus of the coded 
sequence of the compound of the formula II for the generation of antibodies. The 
peptides had the following sequences: 
Peptide 1: H2N-SRS PLH RPH PSP PRS QX-CONH2 
Peptide 2: H2N-LPS SRR PPR TPR F-COOH 
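
To illustrate why the segment is described as proline-rich and basic, the following sketch counts proline and basic residues in Peptide 2 as printed above. The function name and the choice to treat K, R and H as the basic residues are assumptions made for this example; Peptide 1 is left out because its printed form contains an unspecified residue X.

```python
# Peptide 2 as printed in the text (one-letter code, spaces removed).
PEPTIDE_2 = "LPSSRRPPRTPRF"
BASIC = set("KRH")  # lysine, arginine, histidine (assumed definition of "basic")


def composition(peptide):
    """Count proline and basic residues in a one-letter peptide string."""
    return {
        "length": len(peptide),
        "proline": peptide.count("P"),
        "basic": sum(1 for aa in peptide if aa in BASIC),
    }


print(composition(PEPTIDE_2))
# {'length': 13, 'proline': 4, 'basic': 4}
```

On this count, 8 of the 13 residues of Peptide 2 are either proline or arginine.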

The two peptides were synthesized chemically, coupled to a macromolecular 
carrier (Keyhole Limpet Hemocyanin), and injected into 2 rabbits for immunization. The 
resulting antisera exhibited a high antibody titer and could be used successfully both for the 
identification of native neurotrypsin in brain extract of the mouse and for the identification 
of recombinant neurotrypsin. The employed procedure for the generation of antibodies 
can also be used for the generation of antibodies against the coded sequence of the 
compound of the formula I. 

The resulting antibodies against the partial sequences of the coded sequences of 
the compounds of the formulas I or II can be used for the detection and the isolation of 
variant forms of the compounds of the formulas I or II, as for example alleles or splice 
variants. Such antibodies can also be used for the detection and isolation of gene 
technologically generated variants of the compounds of the formulas I or II. 

Purification of the coded sequences of the compounds of the formulas I or II 

Besides conventional chromatographic methods, as for example ion exchange 
chromatography, the purification of the coded sequences of the compounds of the 
formulas I or II can also be achieved using two affinity chromatographic purification 
procedures. One affinity chromatographic purification procedure is based on the 
availability of antibodies. By coupling the antibodies to a chromatographic matrix, a 
purification procedure results in which a very high degree of purity of the corresponding 
compound can be achieved in one step. 

Another important feature that can be used for the purification of the coded 
sequences of the compounds of the formulas I or II is the proline-rich, basic segment at 
the amino terminus. It may be expected that, due to the high density of positive charges, 
this segment mediates the binding of the coded sequences of the compounds of the 
formulas I or II to heparin and heparin-like affinity matrices. This principle also allows the 
isolation, or at least the enrichment, of variant forms of the coded sequences of the 
compounds of the formulas I or II, as for example their alleles or splice variants. Likewise, 
heparin affinity chromatography can be used for the isolation, or at least the 
enrichment, of species-homologous proteins of the compounds of the formulas I or II. 

Industrial Applicability 

The coding sequences of the formulas I and II can be used for the production of 
the coded proteins or parts thereof of the formulas I and II. The production of the coded 
proteins can be achieved in procaryotic or eucaryotic expression systems. 

The gene expression pattern of the inventive compounds in the brain is extremely 
interesting, because these molecules are expressed in the adult nervous system 
predominantly in neurons of those regions that are thought to play an important role in 
learning and memory functions. Together with the recently found evidence for a role of 
extracellular proteases in neural plasticity, the expression pattern allows the assumption 
that the proteolytic activity of neurotrypsin has a role in structural reorganizations in 
connection with learning and memory operations, for example operations which are 
involved in the processing and storage of learned behaviors, learned emotions, or 
memory contents. The inventive compounds may, thus, represent a target for 
pharmaceutical intervention in malfunctions of the brain. 

The gene expression pattern of the inventive compounds in the cerebral cortex 
(especially layers V and VI) is extremely interesting, because a reduction of the cellular 
differentiation in the cerebral cortex has been found to be associated with schizophrenia. 
The inventive compounds may, thus, be a target for pharmaceutical intervention in 
schizophrenia and related psychiatric diseases. 

The coding sequences of the inventive compounds have been found to be 
increased in the neurons located adjacent to the damaged tissue of a focal ischemic 
stroke, indicating that the inventive compounds play a role in the tissue reaction in the 
injured cerebral tissue. The inventive compounds may, thus, represent a target for 
pharmaceutical intervention after ischemic stroke and other forms of neural tissue 
damage. 

Tissue-type plasminogen activator, a serine protease related to the inventive 
compounds, has recently been found to be involved in excitotoxicity-mediated neuronal 
cell death. A similar function is conceivable for the inventive compounds and, thus, the 
inventive compounds represent a possible target for a pharmacological intervention in 
diseases in which cell death occurs. 

The gene expression pattern of the inventive compounds in the spinal cord and in 
the sensory ganglia is interesting, because these molecules are expressed in the adult 
nervous system in neurons of those regions that are thought to play a role in the 
processing of pain, as well as in the pathogenesis of pathological pain. The inventive 
compounds may, thus, be a target for pharmaceutical intervention in pathological pain. 



In the following part statements concerning the compounds of the formulas I or II 
are given: 

(1) INFORMATION ABOUT THE COMPOUND OF THE FORMULA I 
('Neurotrypsin of the human') 

(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 3350 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single strand 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Homo sapiens 
(D) DEVELOPMENT STAGE: fetal 
(F) TISSUE TYPE: brain 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: human fetal brain 5'-stretch plus cDNA library in the lambda 
gt10 vector; catalog No. HL 3003a; Clontech, Palo Alto, CA, USA. 
(B) CLONE: cDNA Clone No.: 
3-1, 3-2, 3-6, 3-7, 3-8, 3-10, 3-11, 3-12 

(ix) FEATURE: 

(A) NAME/KEY: signal peptide 
(B) LOCATION: 44 .. 103 

(ix) FEATURE: 

(A) NAME/KEY: mature peptide 
(B) LOCATION: 104 .. 2668 

(ix) FEATURE: 

(A) NAME/KEY: coding sequence 
(B) LOCATION: 44 .. 2668 

(ix) FEATURE: 

(A) NAME/KEY: proline-rich, basic segment 
(B) LOCATION: 104 .. 319 

(ix) FEATURE: 

(A) NAME/KEY: Kringle domain 
(B) LOCATION: 320 .. 538 

(ix) FEATURE: 

(A) NAME/KEY: SRCR domain 1 
(B) LOCATION: 551 .. 856 

(ix) FEATURE: 

(A) NAME/KEY: SRCR domain 2 
(B) LOCATION: 881 .. 1186 

(ix) FEATURE: 

(A) NAME/KEY: SRCR domain 3 
(B) LOCATION: 1202 .. 1504 

(ix) FEATURE: 

(A) NAME/KEY: SRCR domain 4 
(B) LOCATION: 1541 .. 1846 

(ix) FEATURE: 

(A) NAME/KEY: proteolytic domain 
(B) LOCATION: 1898 .. 2668 

(ix) FEATURE: 

(A) NAME/KEY: histidine of the catalytic triad 
(B) LOCATION: 2069 .. 2071 

(ix) FEATURE: 

(A) NAME/KEY: aspartic acid of the catalytic triad 
(B) LOCATION: 2219 .. 2221 

(ix) FEATURE: 

(A) NAME/KEY: serine of the catalytic triad 
(B) LOCATION: 2516 .. 2518 

(ix) FEATURE: 

(A) NAME/KEY: polyA signal 
(B) LOCATION: 2873 .. 2878 

(ix) FEATURE: 

(A) NAME/KEY: polyA signal 
(B) LOCATION: 3034 .. 3039 

(ix) FEATURE: 

(A) NAME/KEY: polyA signal 
(B) LOCATION: 3215 .. 3220 

(ix) FEATURE: 

(A) NAME/KEY: 3'UTR 
(B) LOCATION: 2669 .. 3350 

(ix) FEATURE: 

(A) NAME/KEY: 5'UTR 
(B) LOCATION: 1 .. 43 
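
The feature coordinates listed above can be cross-checked for internal consistency. The following minimal sketch, in which the dictionary layout and helper name are illustrative assumptions, treats the locations as 1-based inclusive ranges and verifies that the coding sequence is a whole number of codons and that the untranslated regions and coding sequence together span the stated 3350 base pairs.

```python
# Feature coordinates (1-based, inclusive) as listed for the compound of formula I.
FEATURES = {
    "5'UTR": (1, 43),
    "signal peptide": (44, 103),
    "mature peptide": (104, 2668),
    "coding sequence": (44, 2668),
    "3'UTR": (2669, 3350),
}
TOTAL_LENGTH = 3350  # stated length of the cDNA in base pairs


def length(name):
    """Length in nucleotides of a named feature."""
    start, end = FEATURES[name]
    return end - start + 1


# The coding sequence should divide evenly into codons.
assert length("coding sequence") % 3 == 0
# Signal peptide plus mature peptide should make up the coding sequence.
assert length("signal peptide") + length("mature peptide") == length("coding sequence")
# 5'UTR, coding sequence and 3'UTR should tile the full-length cDNA.
assert length("5'UTR") + length("coding sequence") + length("3'UTR") == TOTAL_LENGTH

print(length("coding sequence") // 3)  # 875 codons
print(length("signal peptide") // 3)   # 20 codons
```

All three checks pass for the coordinates as listed.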



Compound of the formula I (neurotrypsin of the human) 



CGGAAGCTGG GGAGCATGGA CCAGACCCCG CAGCGCTGGC ACC ATG ACG CTC GCC 
Met Thr Leu Ala 
-20 

CGC TTC GTG CTA GCC CTG ATG TTA GGG GCG CTC CCC GAA GTG GTC GGC 
Arg Phe Val Leu Ala Leu Met Leu Gly Ala Leu Pro Glu Val Val Gly 

TTT GAT TCT GTC CTC AAT GAT TCC CTC CAC CAC AGC CAC CGC CAT TCG 
Phe Asp Ser Val Leu Asn Asp Ser Leu His His Ser His Arg His Ser 
1 5 10 15 

CCC CCT GCG GGT CCG CAC TAC CCC TAT TAC CTT CCC ACC CAG CAG CGG 
Pro Pro Ala Gly Pro His Tyr Pro Tyr Tyr Leu Pro Thr Gln Gln Arg 
20 25 30 

CCC CCG ACG ACG CGT CCG CCG CCG CCT CTC CCG CGC TTC CCG CGC CCC 
Pro Pro Thr Thr Arg Pro Pro Pro Pro Leu Pro Arg Phe Pro Arg Pro 
35 40 45 

CCG CGG GCG CTC CCT GCC CAG CGC CCG CAC GCC CTC CAG GCC GGG CAC 
Pro Arg Ala Leu Pro Ala Gln Arg Pro His Ala Leu Gln Ala Gly His 
50 55 60 

ACG CCC CGG CCG CAC CCC TGG GGC TGC CCC GCC GGC GAG CCA TGG GTC 
Thr Pro Arg Pro His Pro Trp Gly Cys Pro Ala Gly Glu Pro Trp Val 

AGC GTG ACG GAC TTC GGC GCC CCG TGT CTG CGG TGG GCG GAG GTG CCA 
Ser Val Thr Asp Phe Gly Ala Pro Cys Leu Arg Trp Ala Glu Val Pro 

CCC TTC CTG GAG CGG TCG CCC CCA GCG AGC TGG GCT CAG CTG CGA GGA 
Pro Phe Leu Glu Arg Ser Pro Pro Ala Ser Trp Ala Gln Leu Arg Gly 
100 105 110 

CAG CGC CAC AAC TTT TGT CGG AGC CCC GAC GGC GCG GGC AGA CCC TGG 
Gln Arg His Asn Phe Cys Arg Ser Pro Asp Gly Ala Gly Arg Pro Trp 
115 120 125 

TGT TTC TAC GGA GAC GCC CGT GGC AAG GTG GAC TGG GGC TAC TGC GAC 
Cys Phe Tyr Gly Asp Ala Arg Gly Lys Val Asp Trp Gly Tyr Cys Asp 
130 135 140 

TGC AGA CAC GGA TCA GTA CGA CTT CGT GGC GGC AAA AAT GAG TTT GAA 
Cys Arg His Gly Ser Val Arg Leu Arg Gly Gly Lys Asn Glu Phe Glu 
145 150 155 160 

GGC ACA GTG GAA GTA TAT GCA AGT GGA GTT TGG GGC ACT GTC TGT AGC 
Gly Thr Val Glu Val Tyr Ala Ser Gly Val Trp Gly Thr Val Cys Ser 
165 170 175 

AGC CAC TGG GAT GAT TCT GAT GCA TCA GTC ATT TGT CAC CAG CTG CAG 
Ser His Trp Asp Asp Ser Asp Ala Ser Val Ile Cys His Gln Leu Gln 
180 185 190 

CTG GGA GGA AAA GGA ATA GCA AAA CAA ACC CCG TTT TCT GGA CTG GGC 
Leu Gly Gly Lys Gly Ile Ala Lys Gln Thr Pro Phe Ser Gly Leu Gly 
195 200 205 

CTT ATT CCC ATT TAT TGG AGC AAT GTC CGT TGC CGA GGA GAT GAA GAA 
Leu Ile Pro Ile Tyr Trp Ser Asn Val Arg Cys Arg Gly Asp Glu Glu 
210 215 220 

AAT ATA CTG CTT TGT GAA AAA GAC ATC TGG CAG GGT GGG GTG TGT CCT 
Asn Ile Leu Leu Cys Glu Lys Asp Ile Trp Gln Gly Gly Val Cys Pro 
225 230 235 240 

CAG AAG ATG GCA GCT GCT GTC ACG TGT AGC TTT TCC CAT GGC CCA ACG 
Gln Lys Met Ala Ala Ala Val Thr Cys Ser Phe Ser His Gly Pro Thr 
245 250 255 

TTC CCC ATC ATT CGC CTT GCT GGA GGC AGC AGT GTG CAT GAA GGC CGG 
Phe Pro Ile Ile Arg Leu Ala Gly Gly Ser Ser Val His Glu Gly Arg 
260 265 270 

GTG GAG CTC TAC CAT GCT GGC CAG TGG GGA ACC GTT TGT GAT GAC CAA 
Val Glu Leu Tyr His Ala Gly Gln Trp Gly Thr Val Cys Asp Asp Gln 
275 280 285 

TGG GAT GAT GCC GAT GCA GAA GTG ATC TGC AGG CAG CTG GGC CTC AGT 
Trp Asp Asp Ala Asp Ala Glu Val Ile Cys Arg Gln Leu Gly Leu Ser 
290 295 300 

GGC ATT GCC AAA GCA TGG CAT CAG GCA TAT TTT GGG GAA GGG TCT GGC 
Gly Ile Ala Lys Ala Trp His Gln Ala Tyr Phe Gly Glu Gly Ser Gly 
305 310 315 320 

CCA GTT ATG TTG GAT GAA GTA CGC TGC ACT GGG AAT GAG CTT TCA ATT 
Pro Val Met Leu Asp Glu Val Arg Cys Thr Gly Asn Glu Leu Ser Ile 
325 330 335 

GAG CAG TGT CCA AAG AGC TCC TGG GGA GAG CAT AAC TGT GGC CAT AAA 
Glu Gln Cys Pro Lys Ser Ser Trp Gly Glu His Asn Cys Gly His Lys 
340 345 350 

GAA GAT GCT GGA GTG TCC TGT ACC CCT CTA ACA GAT GGG GTC ATC AGA 
Glu Asp Ala Gly Val Ser Cys Thr Pro Leu Thr Asp Gly Val Ile Arg 
355 360 365 

CTT GCA GGT GGG AAA GGC AGC CAT GAG GGT CGC TTG GAG GTA TAT TAC 
Leu Ala Gly Gly Lys Gly Ser His Glu Gly Arg Leu Glu Val Tyr Tyr 
370 375 380 

AGA GGC CAG TGG GGA ACT GTC TGT GAT GAT GGC TGG ACT GAG CTG AAT 
Arg Gly Gln Trp Gly Thr Val Cys Asp Asp Gly Trp Thr Glu Leu Asn 
385 390 395 400 

ACA TAC GTG GTT TGT CGA CAG TTG GGA TTT AAA TAT GGT AAA CAA GCA 
Thr Tyr Val Val Cys Arg Gln Leu Gly Phe Lys Tyr Gly Lys Gln Ala 
405 410 415 

TCT GCC AAC CAT TTT GAA GAA AGC ACA GGG CCC ATA TGG TTG GAT GAC 
Ser Ala Asn His Phe Glu Glu Ser Thr Gly Pro Ile Trp Leu Asp Asp 
420 425 430 

GTC AGC TGC TCA GGA AAG GAA ACC AGA TTT CTT CAG TGT TCC AGG CGA 
Val Ser Cys Ser Gly Lys Glu Thr Arg Phe Leu Gln Cys Ser Arg Arg 
435 440 445 

CAG TGG GGA AGG CAT GAC TGC AGC CAC CGC GAA GAT GTT AGC ATT GCC 
Gln Trp Gly Arg His Asp Cys Ser His Arg Glu Asp Val Ser Ile Ala 
450 455 460 

TGC TAC CCT GGC GGC GAG GGA CAC AGG CTC TCT CTG GGT TTT CCT GTC 
Cys Tyr Pro Gly Gly Glu Gly His Arg Leu Ser Leu Gly Phe Pro Val 
465 470 475 480 

AGA CTG ATG GAT GGA GAA AAT AAG AAA GAA GGA CGA GTG GAG GTT TTT 
Arg Leu Met Asp Gly Glu Asn Lys Lys Glu Gly Arg Val Glu Val Phe 
485 490 495 

ATC AAT GGC CAG TGG GGA ACA ATC TGT GAT GAT GGA TGG ACT GAT AAG 
Ile Asn Gly Gln Trp Gly Thr Ile Cys Asp Asp Gly Trp Thr Asp Lys 
500 505 510 

GAT GCA GCT GTG ATC TGT CGT CAG CTT GGC TAC AAG GGT CCT GCC AGA 
Asp Ala Ala Val Ile Cys Arg Gln Leu Gly Tyr Lys Gly Pro Ala Arg 
515 520 525 

GCA AGA ACC ATG GCT TAC TTT GGA GAA GGA AAA GGA CCC ATC CAT GTG 
Ala Arg Thr Met Ala Tyr Phe Gly Glu Gly Lys Gly Pro Ile His Val 
530 535 540 

GAT AAT GTG AAG TGC ACA GGA AAT GAG AGG TCC TTG GCT GAC TGT ATC 
Asp Asn Val Lys Cys Thr Gly Asn Glu Arg Ser Leu Ala Asp Cys Ile 
545 550 555 560 

AAG CAA GAT ATT GGA AGA CAC AAC TGC CGC CAC AGT GAA GAT GCA GGA 
Lys Gln Asp Ile Gly Arg His Asn Cys Arg His Ser Glu Asp Ala Gly 
565 570 575 

GTT ATT TGT GAT TAT TTT GGC AAG AAG GCC TCA GGT AAC AGT AAT AAA 
Val Ile Cys Asp Tyr Phe Gly Lys Lys Ala Ser Gly Asn Ser Asn Lys 
580 585 590 

GAG TCC CTC TCA TCT GTT TGT GGC TTG AGA TTA CTG CAC CGT CGG CAG 
Glu Ser Leu Ser Ser Val Cys Gly Leu Arg Leu Leu His Arg Arg Gln 
595 600 605 

AAG CGG ATC ATT GGT GGG AAA AAT TCT TTA AGG GGT GGT TGG CCT TGG 
Lys Arg Ile Ile Gly Gly Lys Asn Ser Leu Arg Gly Gly Trp Pro Trp 
610 615 620 

CAG GTT TCC CTC CGG CTG AAG TCA TCC CAT GGA GAT GGC AGG CTC CTC 
Gln Val Ser Leu Arg Leu Lys Ser Ser His Gly Asp Gly Arg Leu Leu 
625 630 635 640 

TGC GGG GCT ACG CTC CTG AGT AGC TGC TGG GTC CTC ACA GCA GCA CAC 
Cys Gly Ala Thr Leu Leu Ser Ser Cys Trp Val Leu Thr Ala Ala His 
645 650 655 

TGT TTC AAG AGG TAT GGC AAC AGC ACT AGG AGC TAT GCT GTT AGG GTT 
Cys Phe Lys Arg Tyr Gly Asn Ser Thr Arg Ser Tyr Ala Val Arg Val 
660 665 670 

GGA GAT TAT CAT ACT CTG GTA CCA GAG GAG TTT GAG GAA GAA ATT GGA 
Gly Asp Tyr His Thr Leu Val Pro Glu Glu Phe Glu Glu Glu Ile Gly 
675 680 685 

GTT CAA CAG ATT GTG ATT CAT CGG GAG TAT CGA CCC GAC CGC AGT GAT 
Val Gln Gln Ile Val Ile His Arg Glu Tyr Arg Pro Asp Arg Ser Asp 
690 695 700 

TAT GAC ATA GCC CTG GTT AGA TTA CAA GGA CCA GAA GAG CAA TGT GCC 
Tyr Asp Ile Ala Leu Val Arg Leu Gln Gly Pro Glu Glu Gln Cys Ala 
705 710 715 720 

AGA TTC AGC AGC CAT GTT TTG CCA GCC TGT TTA CCA CTC TGG AGA GAG 
Arg Phe Ser Ser His Val Leu Pro Ala Cys Leu Pro Leu Trp Arg Glu 
725 730 735 

AGG CCA CAG AAA ACA GCA TCC AAC TGT TAC ATA ACA GGA TGG GGT GAC 
Arg Pro Gln Lys Thr Ala Ser Asn Cys Tyr Ile Thr Gly Trp Gly Asp 
740 745 750 

ACA GGA CGA GCC TAT TCA AGA ACA CTA CAA CAA GCA GCC ATT CCC TTA 
Thr Gly Arg Ala Tyr Ser Arg Thr Leu Gln Gln Ala Ala Ile Pro Leu 
755 760 765 

CTT CCT AAA AGG TTT TGT GAA GAA CGT TAT AAG GGT CGG TTT ACA GGG 
Leu Pro Lys Arg Phe Cys Glu Glu Arg Tyr Lys Gly Arg Phe Thr Gly 
770 775 780 

AGA ATG CTT TGT GCT GGA AAC CTC CAT GAA CAC AAA CGC GTG GAC AGC 
Arg Met Leu Cys Ala Gly Asn Leu His Glu His Lys Arg Val Asp Ser 
785 790 795 800 

TGC CAG GGA GAC AGC GGA GGA CCA CTC ATG TGT GAA CGG CCC GGA GAG 
Cys Gln Gly Asp Ser Gly Gly Pro Leu Met Cys Glu Arg Pro Gly Glu 
805 810 815 

AGC TGG GTG GTG TAT GGG GTG ACC TCC TGG GGG TAT GGC TGT GGA GTC 
Ser Trp Val Val Tyr Gly Val Thr Ser Trp Gly Tyr Gly Cys Gly Val 
820 825 830 

AAG GAT TCT CCT GGT GTT TAT ACC AAA GTC TCA GCC TTT GTA CCT TGG 
Lys Asp Ser Pro Gly Val Tyr Thr Lys Val Ser Ala Phe Val Pro Trp 
835 840 845 

ATA AAA AGT GTC ACC AAA CTG TAA TTCTTCATGG AAACTTCAAA GCAGCATTT 
Ile Lys Ser Val Thr Lys Leu * 
850 855 

AAACAAATGG AAAACTTTGA ACCCCCACTA TTAGCACTCA GCAGAGATGA CAACAAATGG 2760 

CAAGATCTGT TTTTGCTTTG TGTTGTGGTA AAAAATTGTG TACCCCCTGC TGCTTTTGAG 2820 

AAATTTGTGA ACATTTTCAG AGGCCTCAGT GTAGTGGAAG TGATAATCCT TAAATGAACA 2880 

TTTTCTACCC TAATTTCACT GGAGTGACTT ATTCTAAGCC TCATCTATCC CCTACCTATT 2940 

TCTCAAAATC ATTCTATGCT GATTTTACAA AAGATCATTT TTACATTTGA ACTGAGAACC 3000 

CCTTTTAATT GAATCAGTGG TGTCTGAAAT CATATTAAAT ACCCACATTT GACATAAATG 3060 

CGGTACCCTT TACTACACTC ATGAGTGGCA TATTTATGCT TAGGTCTTTT CAAAAGACTT 3120 

GACAAGAAAT CTTCATATTC TCTGTAGCCT TTGTCAAGTG AGGAAATCAG TGGTTAAAGA 3180 

ATTCCACTAT AAACTTTTAG GCCTGAATAG GAGTAGTAAA GCCTCAAGGA CATCTGCCTG 3240 

TCACAATATA TTCTCAAAGT GATCTGATAT TTGGAAACAA GTATCCTTGT TGAGTACCAA 3300 

GTGCTACAGA AACCATAAGA TAAAAATACT TTCTACCTAC AGCGTGCCCG 3350 

(1) INFORMATION ABOUT THE COMPOUND OF THE FORMULA II (Neurotrypsin 
of the mouse) 



(i) SEQUENCE CHARACTERISTICS: 

(A) LENGTH: 2376 base pairs 
(B) TYPE: nucleic acid 
(C) STRANDEDNESS: single strand 
(D) TOPOLOGY: linear 

(ii) MOLECULE TYPE: cDNA to mRNA 

(vi) ORIGINAL SOURCE: 

(A) ORGANISM: Mus musculus 
(D) DEVELOPMENT STAGE: postnatal day 10 
(F) TISSUE TYPE: brain 
(G) CELL TYPE: neurons 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: mouse brain cDNA library in the lambda Uni-ZAP-XR vector, 
oligo(dT)-primed, from Balb/c mice, postnatal day 20, 
Cat. No. 937 319; Stratagene, La Jolla, CA, USA 
(B) CLONE: cDNA clone no. 16 

(vii) IMMEDIATE SOURCE: 

(A) LIBRARY: mouse brain cDNA library in the Lambda gt10 vector, 
oligo(dT)- and random-primed, embryonic day 15, 
Cat. No. ML 3002a; Clontech, Palo Alto, CA, USA 
(B) CLONE: cDNA clone #25 

(ix) FEATURE: 

(A) NAME/KEY: signal peptide 
(B) LOCATION: 24 .. 86 

(ix) FEATURE: 

(A) NAME/KEY: mature peptide 
(B) LOCATION: 87 .. 2306 

(ix) FEATURE: 

(A) NAME/KEY: coding sequence 
(B) LOCATION: 24 .. 2306 

(ix) FEATURE: 

(A) NAME/KEY: proline-rich, basic segment 
(B) LOCATION: 90 .. 275 

(ix) FEATURE: 

(A) NAME/KEY: Kringle domain 
(B) LOCATION: 276 .. 494 

(ix) FEATURE: 

(A) NAME/KEY: SRCR domain 1 
(B) LOCATION: 519 .. 824 

(ix) FEATURE: 

(A) NAME/KEY: SRCR domain 2 
(B) LOCATION: 840 .. 1142 

(ix) FEATURE: 

(A) NAME/KEY: SRCR domain 3 
(B) LOCATION: 1179 .. 1484 

(ix) FEATURE: 

(A) NAME/KEY: proteolytic domain 
(B) LOCATION: 1536 .. 2306 

(ix) FEATURE: 

(A) NAME/KEY: histidine of the catalytic triad 
(B) LOCATION: 1707 .. 1709 

(ix) FEATURE: 

(A) NAME/KEY: aspartic acid of the catalytic triad 
(B) LOCATION: 1857 .. 1859 

(ix) FEATURE: 

(A) NAME/KEY: serine of the catalytic triad 
(B) LOCATION: 2154 .. 2156 

(ix) FEATURE: 

(A) NAME/KEY: polyA signal 
(B) LOCATION: 2324 .. 2329 and 2331 .. 2336 

(ix) FEATURE: 

(A) NAME/KEY: polyA segment 
(B) LOCATION: 2357 .. 2376 

(ix) FEATURE: 

(A) NAME/KEY: 3'UTR 
(B) LOCATION: 2307 .. 2341 or 2307 .. 2356 

(ix) FEATURE: 

(A) NAME/KEY: 5'UTR 
(B) LOCATION: 1 .. 23 

Compound of the formula II (neurotrypsin of the mouse) 



GGACCACACT CGGCGCCGCA GCC ATG GCG CTC GCC CGC TGC GTG CTG GCT GTG 
Met Ala Leu Ala Arg Cys Val Leu Ala Val 

ATT TTA GGG GCA CTG TCT GTA GTG GCC CGC GCT GAT CCG GTC TCG CGC 
Ile Leu Gly Ala Leu Ser Val Val Ala Arg Ala Asp Pro Val Ser Arg 

TCT CCC CTT CAC CGC CCG CAT CCG TCC CCA CCG CGT TCC CAA CAC GCG 
Ser Pro Leu His Arg Pro His Pro Ser Pro Pro Arg Ser Gln His Ala 

CAC TAC CTT CCC AGC TCG CGG CGG CCA CCC AGG ACC CCG CGC TTC CCG 
His Tyr Leu Pro Ser Ser Arg Arg Pro Pro Arg Thr Pro Arg Phe Pro 

CTC CCG CTG CGG ATC CCC GCT GCC CAG CGC CCG CAG GTC CTC AGC ACC 
Leu Pro Leu Arg Ile Pro Ala Ala Gln Arg Pro Gln Val Leu Ser Thr 

GGG CAC ACG CCC CCG ACG ATT CCA CGC CGC TGC GGG GCA GGA GAG TCG 
Gly His Thr Pro Pro Thr Ile Pro Arg Arg Cys Gly Ala Gly Glu Ser 

TGG GGC AAT GCC ACC AAC CTC GGC GTC CCG TGT CTA CAC TGG GAC GAG 
Trp Gly Asn Ala Thr Asn Leu Gly Val Pro Cys Leu His Trp Asp Glu 

GTG CCG CCC TTC CTG GAG CGG TCG CCC CCG GCC AGT TGG GCT GAG CTG 
Val Pro Pro Phe Leu Glu Arg Ser Pro Pro Ala Ser Trp Ala Glu Leu 
90 95 100 

CGA GGG CAG CCG CAC AAC TTC TGC CGG AGC CCG GAT GGC TCG GGC AGA 
Arg Gly Gln Pro His Asn Phe Cys Arg Ser Pro Asp Gly Ser Gly Arg 
105 110 115 

CCT TGG TGC TTC TAT CGG AAT GCC CAG GGC AAA GTA GAC TGG GGC TAC 
Pro Trp Cys Phe Tyr Arg Asn Ala Gln Gly Lys Val Asp Trp Gly Tyr 
120 125 130 

TGC GAT TGT GGT CAA GGC CCG GCG TTG CCC GTC ATT CGC CTT GTT GGT 
Cys Asp Cys Gly Gln Gly Pro Ala Leu Pro Val Ile Arg Leu Val Gly 
135 140 145 

GGG AAC AGT GGG CAT GAA GGT CGA GTG GAG CTG TAC CAC GCT GGC CAG 
Gly Asn Ser Gly His Glu Gly Arg Val Glu Leu Tyr His Ala Gly Gln 
150 155 160 165 

TGG GGG ACC ATC TGT GAC GAC CAA TGG GAC AAT GCA GAC GCA GAC GTC 
Trp Gly Thr Ile Cys Asp Asp Gln Trp Asp Asn Ala Asp Ala Asp Val 
170 175 180 

ATC TGT AGG CAG CTG GGG CTC AGT GGC ATT GCC AAA GCA TGG CAT CAG 
Ile Cys Arg Gln Leu Gly Leu Ser Gly Ile Ala Lys Ala Trp His Gln 
185 190 195 

GCA CAT TTT GGG GAA GGA TCT GGC CCA ATA TTG TTG GAT GAA GTA CGC 
Ala His Phe Gly Glu Gly Ser Gly Pro Ile Leu Leu Asp Glu Val Arg 
200 205 210 

TGC ACC GGA AAC GAG CTG TCA ATT GAG CAA TGT CCA AAG AGT TCC TGG 
Cys Thr Gly Asn Glu Leu Ser Ile Glu Gln Cys Pro Lys Ser Ser Trp 
215 220 225 

GGC GAA CAT AAC TGT GGC CAT AAA GAA GAT GCT GGA GTG TCT TGT GTT 
Gly Glu His Asn Cys Gly His Lys Glu Asp Ala Gly Val Ser Cys Val 
230 235 240 245 

CCT CTA ACA GAT GGT GTC ATC AGA CTG GCA GGA GGA AAA AGT ACC CAT 
Pro Leu Thr Asp Gly Val Ile Arg Leu Ala Gly Gly Lys Ser Thr His 
250 255 260 

GAA GGT CGC CTG GAG GTC TAC TAC AAG GGG CAG TGG GGG ACA GTC TGT 
Glu Gly Arg Leu Glu Val Tyr Tyr Lys Gly Gln Trp Gly Thr Val Cys 
265 270 275 

GAT GAT GGC TGG ACT GAG ATG AAC ACA TAC GTG GCT TGT CGA CTG CTG 
Asp Asp Gly Trp Thr Glu Met Asn Thr Tyr Val Ala Cys Arg Leu Leu 
280 285 290 

GGA TTT AAA TAC GGC AAA CAG TCC TCT GTG AAC CAT TTT GAT GGC AGC 
Gly Phe Lys Tyr Gly Lys Gln Ser Ser Val Asn His Phe Asp Gly Ser 
295 300 305 

AAC AGG CCC ATA TGG CTG GAT GAC GTC AGC TGC TCA GGA AAA GAA GTC 
Asn Arg Pro Ile Trp Leu Asp Asp Val Ser Cys Ser Gly Lys Glu Val 
310 315 320 325 

AGC TTC ATT CAG TGT TCC AGG AGA CAG TGG GGA AGG CAT GAC TGC AGC 
Ser Phe Ile Gln Cys Ser Arg Arg Gln Trp Gly Arg His Asp Cys Ser 
330 335 340 

CAT AGA GAA GAT GTG GGC CTC ACC TGC TAT CCT GAC AGC GAT GGA CAT 
His Arg Glu Asp Val Gly Leu Thr Cys Tyr Pro Asp Ser Asp Gly His 
345 350 355 

AGG CTT TCT CCA GGT TTT CCC ATC AGA CTA GTG GAT GGA GAG AAT AAG 
Arg Leu Ser Pro Gly Phe Pro Ile Arg Leu Val Asp Gly Glu Asn Lys 
360 365 370 

AAG GAA GGA CGA GTG GAG GTT TTT GTC AAT GGC CAA TGG GGA ACA ATC 
Lys Glu Gly Arg Val Glu Val Phe Val Asn Gly Gln Trp Gly Thr Ile 
375 380 385 

TGC GAT GAC GGA TGG ACC GAT AAG CAT GCA GCT GTG ATC TGC CGG CAA 
Cys Asp Asp Gly Trp Thr Asp Lys His Ala Ala Val Ile Cys Arg Gln 
390 395 400 405 

CTT GGC TAT AAG GGT CCT GCC AGA GCA AGG ACT ATG GCT TAT TTT GGG 
Leu Gly Tyr Lys Gly Pro Ala Arg Ala Arg Thr Met Ala Tyr Phe Gly 
410 415 420 

GAA GGA AAA GGC CCC ATC CAC ATG GAT AAT GTG AAG TGC ACA GGA AAT 
Glu Gly Lys Gly Pro Ile His Met Asp Asn Val Lys Cys Thr Gly Asn 
425 430 435 

GAG AAG GCC CTG GCT GAC TGT GTC AAA CAA GAC ATT GGA AGG CAC AAC 
Glu Lys Ala Leu Ala Asp Cys Val Lys Gln Asp Ile Gly Arg His Asn 
440 445 450 

TGC CGC CAC AGT GAG GAT GCA GGA GTC ATC TGT GAC TAT TTA GAG AAG 
Cys Arg His Ser Glu Asp Ala Gly Val Ile Cys Asp Tyr Leu Glu Lys 
455 460 465 

AAA GCA TCA AGT AGT GGT AAT AAA GAG ATG CTC TCA TCT GGA TGT GGA 
Lys Ala Ser Ser Ser Gly Asn Lys Glu Met Leu Ser Ser Gly Cys Gly 
470 475 480 485 

CTG AGG TTA CTG CAC CGT CGG CAG AAA CGG ATC ATT GGT GGG AAC AAT 
Leu Arg Leu Leu His Arg Arg Gln Lys Arg Ile Ile Gly Gly Asn Asn 
490 495 500 

TCT TTA AGG GGT GCC TGG CCT TGG CAG GCT TCC CTC AGG CTG AGG TCG 
Ser Leu Arg Gly Ala Trp Pro Trp Gln Ala Ser Leu Arg Leu Arg Ser 
505 510 515 

GCC CAT GGA GAC GGC AGG CTG CTT TGT GGA GCT ACC CTT CTG AGT AGC 
Ala His Gly Asp Gly Arg Leu Leu Cys Gly Ala Thr Leu Leu Ser Ser 
520 525 530 

TGC TGG GTC CTG ACA GCT GCA CAC TGC TTC AAA AGG TAC GGA AAC AAC 
Cys Trp Val Leu Thr Ala Ala His Cys Phe Lys Arg Tyr Gly Asn Asn 
535 540 545 

TCG AGG AGC TAT GCA GTT CGA GTT GGG GAT TAT CAT ACT CTG GTC CCA 
Ser Arg Ser Tyr Ala Val Arg Val Gly Asp Tyr His Thr Leu Val Pro 
550 555 560 565 

GAG GAG TTT GAA CAA GAA ATA GGG GTT CAA CAG ATT GTG ATT CAC AGG 
Glu Glu Phe Glu Gln Glu Ile Gly Val Gln Gln Ile Val Ile His Arg 
570 575 580 

AAC TAC AGG CCA GAC AGA AGC GAC TAT GAC ATT GCC CTG GTT AGA TTG 
Asn Tyr Arg Pro Asp Arg Ser Asp Tyr Asp Ile Ala Leu Val Arg Leu 
585 590 595 

CAA GGA CCA GGG GAG CAA TGT GCC AGA CTA AGC ACC CAC GTT TTG CCA 
Gln Gly Pro Gly Glu Gln Cys Ala Arg Leu Ser Thr His Val Leu Pro 
600 605 610 

GCC TGT TTA CCT CTA TGG AGA GAG AGG CCA CAG AAA ACA GCC TCC AAC 
Ala Cys Leu Pro Leu Trp Arg Glu Arg Pro Gln Lys Thr Ala Ser Asn 
615 620 625 

TGT CAC ATA ACA GGA TGG GGA GAC ACA GGT CGT GCC TAC TCA AGA ACT 
Cys His Ile Thr Gly Trp Gly Asp Thr Gly Arg Ala Tyr Ser Arg Thr 
630 635 640 645 

CTA CAA CAA GCT GCT GTG CCT CTG TTA CCC AAG AGG TTT TGT AAA GAG 
Leu Gln Gln Ala Ala Val Pro Leu Leu Pro Lys Arg Phe Cys Lys Glu 
650 655 660 

AGG TAC AAG GGA CTA TTT ACT GGG AGA ATG CTC TGT GCT GGG AAC CTC 
Arg Tyr Lys Gly Leu Phe Thr Gly Arg Met Leu Cys Ala Gly Asn Leu 
665 670 675 

CAA GAA GAC AAC CGT GTG GAC AGC TGC CAG GGA GAC AGT GGA GGA CCA 2165 
Gln Glu Asp Asn Arg Val Asp Ser Cys Gln Gly Asp Ser Gly Gly Pro 
680 685 690 

CTC ATG TGT GAA AAG CCT GAT GAG TCC TGG GTT GTG TAT GGG GTG ACT 2213 
Leu Met Cys Glu Lys Pro Asp Glu Ser Trp Val Val Tyr Gly Val Thr 
695 700 705 

TCC TGG GGG TAT GGA TGT GGA GTC AAA GAC ACT CCT GGA GTT TAT ACC 2261 
Ser Trp Gly Tyr Gly Cys Gly Val Lys Asp Thr Pro Gly Val Tyr Thr 
710 715 720 725 

AGA GTC CCC GCT TTT GTA CCT TGG ATA AAA AGT GTC ACC AGT CTG 2306 
Arg Val Pro Ala Phe Val Pro Trp Ile Lys Ser Val Thr Ser Leu 
730 735 740 

TAACTTATGG AAAGCTCAAG AAATAGTAAA ACAGTAACTA TTCAGTCTTC AAAAAAAAAA 2366 

AAAAAAAAAA 2376 

Patent claims 

1 . Neurotrypsins of the formulas 1 and I! 

5 i: neurotrypsin of the human 

II: neurotrypsin of the mouse 

2. Neurotrypsin according to claim 1, characterized in that the compounds of the 
formulas I or II comprise the separate, coding nucleotide sequences and the 
coded amino acid sequences of the compounds of the formulas I or II. 

3. Use of the coding nucleotide sequences of the compounds of the formulas I or II 
for the production of recombinant proteins. 

4. Use of proteins with the coded amino acid sequences of the compounds of the 
formulas I or II as targets for the development of pharmaceutical drugs, for 
example for the inhibition or the enhancement of the catalytic activity of the 
coded proteins of the formulas I or II. 

5. Use of the species-homologous proteins of the compounds of the formulas I or II 
as targets for the development of pharmaceutical drugs, for example for the 
enhancement or the inhibition of the catalytic activity of the coded proteins of the 
formulas I or II. 

6. Use of the proteins with the coded amino acid sequences of the compounds of 
the formulas I or II for spatial structure determination, for example spatial 
structure determination by means of crystallography or nuclear resonance 
spectroscopy. 
7. Use of the coded amino acid sequences of the compounds of the formulas I or II 
for the prediction of the protein structure by means of computerized protein 
structure prediction methods. 

8. Use of the spatial structure of the coded amino acid sequences of the 
compounds of the formulas I or II as targets for the development of 
pharmaceutical drugs, for example for the inhibition or the enhancement of the 
catalytic activity of the coded proteins of the compounds of the formulas I or II. 

9. Use of the coding nucleotide sequences of the compounds of the formulas I or II 
in gene therapeutical applications in humans and in animals, for example as 
parts of gene therapy vectors or as parts of artificial chromosomes. 

10. Use of the compounds of the formulas I or II for so-called cell engineering 
applications for the production of gene technologically mutated cells, which 
produce the coded sequences. 

11. Use of the coded amino acid sequences of the compounds of the formulas I or II 
as antigens for the production of antibodies, for example antibodies that inhibit 
or promote the protease function or antibodies that can be used for 
immunohistochemical studies. 

12. Use of the coding nucleotide sequences of the compounds of the formulas I or II 
for the production of transgenic animals, for example transgenic mice. 

13. Use of the coding nucleotide sequences of the compounds of the formulas I or II 
for the inactivation or the mutation of the corresponding gene by means of gene 
targeting techniques, for example the elimination of the gene in the mouse 
through homologous recombination. 

14. Use of the compounds of the formulas I or II for the diagnostics of disorders in 
the gene corresponding to the compound of the formula I. 

15. Use of the coding nucleotide sequences of the compounds of the formulas I or II 
as a starting sequence for gene technological modifications aimed at the 
production of pharmaceutical compositions or gene therapy vectors which exhibit 
changed properties as compared with the corresponding pharmaceutical 
compositions or gene therapy vectors containing the coding nucleotide sequence 
of the compounds of formulas I or II, for example changed proteolytic activity, 
changed proteolytic specificity, or changed pharmacokinetic characteristics. 
COMBINED DECLARATION FOR PATENT APPLICATION AND POWER OF ATTORNEY 
(Includes Reference to Provisional and PCT International Applications) 



Attorney's Docket No. 
030708-035 



As a below named inventor, I hereby declare that: 

My residence, post office address and citizenship are as stated below next to my name; 

I believe I am the original, first and sole inventor (if only one name is listed below) or an original, first and joint inventor (if plural 
names are listed below) of the subject matter which is claimed and for which a patent is sought on the invention entitled: 

NEUROTRYPSIN 



the specification of which (check only one item below): 
□ is attached hereto. 
□ was filed as United States application ______ and was amended ______ (if applicable). 
☒ was filed as PCT international application Number PCT/IB98/00625 on April 24, 1998 and was amended ______ (if applicable). 



I hereby state that I have reviewed and understand the contents of the above-identified specification, including the claims, as 
amended by any amendment referred to above. 

I acknowledge the duty to disclose to the Office all information known to me to be material to patentability as defined in Title 37, 
Code of Federal Regulations, §1.56. 

I hereby claim foreign priority benefits under Title 35, United States Code, §119 (a)-(e) of any foreign application(s) for patent or 
inventor's certificate or of any PCT international application(s) designating at least one country other than the United States of 
America listed below and have also identified below any foreign application(s) for patent or inventor's certificate or any PCT 
international application(s) designating at least one country other than the United States of America filed by me on the same 
subject matter having a filing date before that of the application(s) of which priority is claimed: 






PRIOR FOREIGN/PCT APPLICATION(S) AND ANY PRIORITY CLAIMS UNDER 35 U.S.C. §119: 



APPLICATION NUMBER 



26 April 1997 



_Yes 



.No 



I hereby claim the benefit under Title 35, United States Code, §119(e) of any United States provisional application(s) listed below. 



(Application Number) 



(Filing Date) 



(Application Number) 



I hereby claim the benefit under Title 35, United States Code, §120 of any United States application(s) or PCT international 
application(s) designating the United States of America that is/are listed below and, insofar as the subject matter of each of the 
claims of this application is not disclosed in that/those prior application(s) in the manner provided by the first paragraph of Title 
35, United States Code, §112, I acknowledge the duty to disclose to the Office all information known to me to be material to the 
patentability as defined in Title 37, Code of Federal Regulations §1.56, which became available between the filing date of the prior 
application(s) and the national or PCT international filing date of this application: 



U.S. APPLICATIONS 

U.S. APPLICATION NUMBER    U.S. FILING DATE    STATUS (check one): PATENTED / PENDING / ABANDONED 


PCT APPLICATIONS DESIGNATING THE U.S. 

PCT APPLICATION NO.    PCT FILING DATE    U.S. APPLICATION NUMBERS ASSIGNED (if any) 

I hereby appoint the following attorneys and agent(s) to prosecute said application and to transact all business in the Patent and 
Trademark Office connected therewith and to file, prosecute and to transact all business in connection with international applications 
directed to said invention: 



William L. Mathis            17,337 
R. Danny Huntington          27,903 
Gerald F. Swiss              30,113 
Robert S. Swecker            19,885 
Eric H. Weisblatt            30,505 
Michael J. Ure               33,089 
Platon N. Mandros            22,124 
James W. Peterson            26,057 
Charles F. Wieland III       33,096 
Benton S. Duffett, Jr.       22,030 
Teresa Stanek Rea            30,427 
Bruce T. Wieder              33,815 
Norman H. Stepno             22,716 
Robert E. Krebs              25,885 
Todd R. Walters              34,040 
Ronald L. Grudziecki         24,970 
William C. Rowland           30,888 
Ronni S. Jillions            31,979 
Frederick G. Michaud, Jr.    26,003 
T. Gene Dillahunty           25,423 
Harold R. Brown III          36,341 
Alan E. Kopecki              25,813 
Patrick C. Keane             32,858 
Allen R. Baum                36,086 
Regis E. Slutter             26,999 
Bruce J. Boggs, Jr.          32,344 
Steven M. du Bois            35,023 
Samuel C. Miller, III        27,360 
William H. Benz              25,952 
Brian P. O'Shaughnessy       32,747 
Robert G. Mukai              28,531 
Peter K. Skiff               31,917 
George A. Hovanec, Jr.       28,223 
James A. LaBarre             28,632 
Richard J. McGrath           29,195 
Matthew L. Schneider         32,814 
E. Joseph Gess               28,510 
Michael G. Savage            32,596 


21833 





Address all correspondence to: 



21839 



William L. Mathis 

Burns, Doane, Swecker & Mathis, L.L.P. 
P.O. Box 1404 

Alexandria, Virginia 22313-1404 



Address all telephone calls to: Bruce J. Boggs, Jr. at (703) 836-6620. 

I hereby declare that all statements made herein of my own knowledge are true and that all statements made on information and 
belief are believed to be true; and further that these statements were made with the knowledge that willful false statements and the 
like so made are punishable by fine or imprisonment, or both, under Section 1001 of Title 18 of the United States Code and that 
such willful false statements may jeopardize the validity of the application or any patent issued thereon. 


FULL NAME OF SOLE OR FIRST INVENTOR 
Peter SONDEREGGER 


SIGNATURE 


RESIDENCE 
Zurich, Switzerland 


CITIZENSHIP 
Swiss 




POST OFFICE ADDRESS 

Biochemisches Institut Universitat Zurich, Winterthurerstr. 190, CH8057 Zurich, Switzerland 


FULL NAME OF SECOND JOINT INVENTOR, IF ANY 


SIGNATURE 


DATE 


RESIDENCE 


CITIZENSHIP 


POST OFFICE ADDRESS 


FULL NAME OF THIRD JOINT INVENTOR, IF ANY 


SIGNATURE 


DATE 


RESIDENCE 


CITIZENSHIP 




POST OFFICE ADDRESS 


FULL NAME OF FOURTH JOINT INVENTOR, IF ANY 


SIGNATURE 


DATE 


RESIDENCE 


CITIZENSHIP 




POST OFFICE ADDRESS 


FULL NAME OF FIFTH JOINT INVENTOR, IF ANY 


SIGNATURE 


DATE 


RESIDENCE 


CITIZENSHIP 




POST OFFICE ADDRESS 


FULL NAME OF SIXTH JOINT INVENTOR, IF ANY 


SIGNATURE 


DATE 


RESIDENCE 


CITIZENSHIP 


POST OFFICE ADDRESS 


FULL NAME OF SEVENTH JOINT INVENTOR, IF ANY 


SIGNATURE 


DATE 


RESIDENCE 


CITIZENSHIP 


POST OFFICE ADDRESS 


FULL NAME OF EIGHTH JOINT INVENTOR, IF ANY 


SIGNATURE 


DATE 


RESIDENCE 


CITIZENSHIP 


POST OFFICE ADDRESS 


FULL NAME OF NINTH JOINT INVENTOR, IF ANY 


SIGNATURE 


DATE 


RESIDENCE 


CITIZENSHIP 


POST OFFICE ADDRESS 